💡 Tip: At its core, URL encoding converts each character to UTF-8 bytes and then percent-encodes those bytes.

URL Encoding and Best Practices

URL encoding, also known as percent-encoding, is the mechanism URLs use to represent bytes that cannot appear literally: each such byte is written as % followed by two hexadecimal digits.

Unreserved Characters

The following characters are unreserved characters:

'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~'

Per RFC 3986, unreserved characters should not be percent-encoded. Other characters, when they appear as data rather than as URL delimiters, should first be converted to UTF-8 and then percent-encoded byte by byte.

Encoding Rules

  1. Unreserved characters remain unchanged
  2. Other characters are first converted to UTF-8 encoding
  3. Each byte of UTF-8 encoding is converted to percent encoding
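The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a library function; the names `UNRESERVED` and `percent_encode` are hypothetical and chosen only for this example.

```python
# Hypothetical helper illustrating the three encoding rules.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every character that is not unreserved."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            # Rule 1: unreserved characters remain unchanged.
            out.append(ch)
        else:
            # Rules 2 and 3: convert to UTF-8, then emit %XX for each byte.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


print(percent_encode("山月"))   # %E5%B1%B1%E6%9C%88
print(percent_encode("a b~c"))  # a%20b~c
```

Note that this sketch encodes every reserved character, including / and ?, so it is only suitable for a single component value, not for a whole URL.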

Conversion Process

For example, 山月 (Chinese characters):

  1. Convert to UTF-8: the UTF-8 encoding of 山月 is the six bytes E5 B1 B1 E6 9C 88 (three bytes per character)
  2. Prefix each UTF-8 byte with a percent sign: %E5%B1%B1%E6%9C%88
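The two steps can be reproduced directly in Python, which makes the intermediate bytes visible. The variable names here are illustrative only.

```python
text = "山月"

# Step 1: convert the string to its UTF-8 bytes.
utf8_bytes = text.encode("utf-8")
hex_bytes = " ".join(f"{b:02X}" for b in utf8_bytes)

# Step 2: prefix every byte with a percent sign.
encoded = "".join(f"%{b:02X}" for b in utf8_bytes)

print(hex_bytes)  # E5 B1 B1 E6 9C 88
print(encoded)    # %E5%B1%B1%E6%9C%88
```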

API

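Note that language APIs differ in how they handle reserved characters such as ! and (. For example, JavaScript's encodeURIComponent leaves them unescaped, while Python's quote percent-encodes them.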

JavaScript

// => '%E5%B1%B1%E6%9C%88'
encodeURIComponent('山月')

// => '山月'
decodeURIComponent('%E5%B1%B1%E6%9C%88')

// => '(!'
encodeURIComponent('(!')

Python

from urllib.parse import quote, unquote

# => '%E5%B1%B1%E6%9C%88'
quote('山月')

# => '山月'
unquote('%E5%B1%B1%E6%9C%88')

# => '%3F%21'
quote('?!')
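When output has to match across languages, the difference can be narrowed with the `safe` parameter of `quote`. The sketch below approximates JavaScript's `encodeURIComponent`, which leaves `! * ' ( )` unescaped; the helper name `encode_uri_component_like` is hypothetical, and edge cases such as invalid surrogates are not covered.

```python
from urllib.parse import quote


def encode_uri_component_like(text: str) -> str:
    """Approximate encodeURIComponent: keep ! * ' ( ) literal, encode '/'."""
    return quote(text, safe="!*'()")


# Python's default treats '/' as safe and encodes '(' and '!'.
print(quote("(!"))   # %28%21
print(quote("a/b"))  # a/b

# The helper flips both behaviours to match the JavaScript result.
print(encode_uri_component_like("(!"))   # (!
print(encode_uri_component_like("a/b"))  # a%2Fb
```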

URL Encoding Best Practices

  1. Always encode user input: Don't assume user input contains only safe characters.
  2. Use the correct encoding function: Encoding functions differ subtly between languages and even within one language, so choose the one that suits your needs. For example, in JavaScript, encodeURIComponent is meant for a single component and escapes delimiters such as / ? & and =, while encodeURI is meant for a complete URL and leaves them intact.
  3. Pay attention to encoding scope: Some characters (like /) have different meanings in different parts of URLs, decide whether to encode based on context.
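  4. Avoid double encoding: Encoding a string that is already encoded turns every % into %25 (so %20 becomes %2520), and a single decode no longer returns the original text. Encode exactly once, at the point where the URL is assembled.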
  5. Consider internationalization: Ensure your application can correctly handle various languages and character sets.
  6. Test edge cases: Test inputs containing various special characters and non-ASCII characters.
  7. Follow RFC standards: Refer to RFC 3986 for more details.
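Points 3 and 4 above are easier to see in code. The following sketch encodes a path segment and a query string separately, then shows what double encoding does; the host `example.com` and the sample values are made up for illustration.

```python
from urllib.parse import quote, unquote, urlencode

# Encoding scope: inside a single path segment, '/' is data and must be encoded.
segment = "reports/2024 Q1"
path_part = quote(segment, safe="")
print(path_part)  # reports%2F2024%20Q1

# Query strings: urlencode escapes each key and value (spaces become '+').
query = urlencode({"q": "山月 & co", "page": 2})
print(query)  # q=%E5%B1%B1%E6%9C%88+%26+co&page=2

url = f"https://example.com/files/{path_part}?{query}"

# Double encoding: the second pass encodes the '%' itself.
once = quote("a b", safe="")
twice = quote(once, safe="")
print(once)            # a%20b
print(twice)           # a%2520b
print(unquote(twice))  # a%20b (one decode does not recover 'a b')
```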

Connection to UTF-8

URL encoding is closely related to UTF-8 encoding. When URL encoding non-ASCII characters:

  1. First step: Convert the character to UTF-8 bytes
  2. Second step: Apply percent encoding to each UTF-8 byte

Understanding UTF-8 encoding helps you better understand why URL encoding produces specific results. For example, the Chinese character 山 becomes %E5%B1%B1 because:

  • 山 is encoded in UTF-8 as the bytes E5 B1 B1
  • Each byte gets a % prefix: %E5%B1%B1
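Because UTF-8 is a variable-length encoding, the number of %XX groups depends on the character. A short sketch, using sample characters chosen for this example, shows one-, two-, three- and four-byte cases:

```python
from urllib.parse import quote

# Sample characters chosen for illustration: ASCII, Latin, CJK, emoji.
samples = ["A", "é", "山", "😀"]

results = {}
for ch in samples:
    byte_count = len(ch.encode("utf-8"))
    results[ch] = (byte_count, quote(ch, safe=""))
    print(ch, byte_count, results[ch][1])

# A  1 A              (unreserved, left as-is)
# é  2 %C3%A9
# 山 3 %E5%B1%B1
# 😀 4 %F0%9F%98%80
```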

Conclusion

URL encoding is an indispensable part of web development. Understanding and applying it correctly helps you build more robust and secure applications, and following the best practices in this article avoids many common URL encoding problems.