The URL Encode tool converts a text string into a form suitable for inclusion in a URL. This form is called percent encoding or URL encoding. The URL Decode tool converts an encoded string back into a more user-friendly form.
- To encode text: Enter the text to be included in a URL parameter into the URL Encode box and click Go!
- To decode text: Enter the encoded text into the URL Decode box and click Go!
When to Use It
A URL can contain parameters that include syntactic markers. However, if these markers are not encoded, the browser will parse them incorrectly. For example, let’s say that you have this parameter:
text=one two three
If you include this in a URL, you’ll get
http://www.example.com?text=one two three
This link is non-functional because a space marks the end of the URL (e.g., the server sees http://www.example.com?text=one and ignores the rest).
URL encoding converts problematic characters, including whitespace, into a form that can be recognized by servers.
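As a minimal sketch of the example above, here is how the same parameter value could be encoded with Python's standard library. The variable names are illustrative only, and the tool itself may be implemented differently.

```python
from urllib.parse import quote

base = "http://www.example.com"  # example host used in the text above
value = "one two three"          # raw parameter value containing spaces

# safe="" asks quote() to percent-encode every reserved character
encoded = quote(value, safe="")
url = f"{base}?text={encoded}"

print(url)  # http://www.example.com?text=one%20two%20three
```

Each space becomes %20, so the URL no longer contains a character that ends it early.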
URL encoding can also be used for HTTP headers that set cookie values and for data included in HTTP POST requests.
Conversely, the URL Decode tool is useful when you have a body of text, including URLs, with encoded characters and you want to see what it looks like as plain text.
What It Does
The URL Encode tool takes a string and converts it to a URL-encoded format (also known as the percent-encoded format). It treats any non-ASCII characters as UTF-8.
The URL Decode tool takes a string with URL-encoded characters and converts it back to the un-encoded format.
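The round trip described above can be sketched with Python's standard library, which also treats non-ASCII text as UTF-8 by default. The sample string is an assumption chosen for illustration, and this is not necessarily how the tool works internally.

```python
from urllib.parse import quote, unquote

original = "caf\u00e9 & cr\u00e8me"  # hypothetical input with non-ASCII letters

encoded = quote(original, safe="")   # non-ASCII characters are encoded as UTF-8 bytes
decoded = unquote(encoded)           # reverses the encoding

print(encoded)  # caf%C3%A9%20%26%20cr%C3%A8me
print(decoded == original)  # True
```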
A Deeper Look
The term URL encoding is commonly used, but it is slightly inaccurate. The encoding applies to the more general category of Uniform Resource Identifiers (URIs), including Uniform Resource Names (URNs) — not just URLs.
URL parameter values can include ASCII alphanumeric characters without difficulty. However, certain characters are “reserved,” and they have to be encoded to make sure that the server interprets the URL correctly. URL encoding should be applied only to parameter values. A character is encoded by replacing it with a percent sign (%) followed by the two-digit hexadecimal value of its byte; a space, for example, becomes %20.
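To illustrate why reserved characters in a value need encoding, the following sketch parses a query string with and without encoding. The value "a=b&c" is a made-up example, and parse_qs stands in for whatever parsing a real server performs.

```python
from urllib.parse import quote, parse_qs

value = "a=b&c"  # hypothetical value containing the reserved characters "=" and "&"

raw_query = "text=" + value                   # reserved characters left as-is
safe_query = "text=" + quote(value, safe="")  # reserved characters percent-encoded

print(parse_qs(raw_query))   # {'text': ['a=b']}
print(parse_qs(safe_query))  # {'text': ['a=b&c']}
```

In the unencoded case the "&" is read as a parameter separator, so part of the value is lost.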
All characters other than ones deemed “safe” are replaced when a URL is encoded. The only safe characters are:
- Upper and lower case ASCII letters and digits
- The following characters: $ - _ . + ! * ' ( ) ,
All other characters, including non-printing characters and anything outside of 7-bit ASCII, need to be encoded. Safe characters can also be encoded, but there’s no benefit, and in some cases there can even be some risk: if the server-side software isn’t expecting encoded characters and doesn’t decode them, it may not handle the input properly.
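The rule above can be sketched as a small hand-written encoder. The function name percent_encode and the SAFE set are illustrative assumptions that follow the list in this article (ASCII letters, digits, and the listed punctuation). Library functions such as Python's urllib.parse.quote use a different default safe set, so their output can differ.

```python
import string

# Safe set assumed from the article: ASCII letters, digits, and the listed punctuation
SAFE = set(string.ascii_letters + string.digits + "$-_.+!*'(),")

def percent_encode(text: str) -> str:
    """Percent-encode every UTF-8 byte of text that is not in SAFE."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)               # safe characters pass through unchanged
        else:
            out.append(f"%{byte:02X}")   # "%" plus two uppercase hex digits
    return "".join(out)

print(percent_encode("one two three"))  # one%20two%20three
print(percent_encode("100%"))           # 100%25
```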
The URL encoding tool assumes it is getting either ASCII or UTF-8 text. However, the encoding itself doesn’t assume receipt of any specific character set. These are merely bytes that the server can interpret however it is programmed to do so. They can be interpreted as UTF-8, Latin-1, or anything else. Most websites today expect UTF-8, but there’s no requirement forcing them to do so. Some older websites will interpret URL parameters in Latin-1, Microsoft Windows-1252, and so on.
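The following sketch shows how the same character produces different bytes under different character sets, and how decoding with the wrong one garbles the text. It uses the encoding parameter of Python's quote and unquote; the sample character is chosen only for illustration.

```python
from urllib.parse import quote, unquote

e_acute = "\u00e9"  # the letter "é"

utf8_form = quote(e_acute, safe="", encoding="utf-8")
latin1_form = quote(e_acute, safe="", encoding="latin-1")
print(utf8_form)    # %C3%A9
print(latin1_form)  # %E9

# A server that assumes Latin-1 reads the two UTF-8 bytes as two separate characters
misread = unquote(utf8_form, encoding="latin-1")
print(misread == "\u00c3\u00a9")  # True
```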
Encoding Other Types of Data
Percent encoding can technically be applied to the domain name portion of a URL, but the W3C recommends Punycode, which is designed for domain names. Do not use the URL Encode tool on domain names; this will convert the periods and render the URL useless.
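For comparison, here is a minimal sketch of Punycode conversion using Python's built-in "idna" codec, which implements the older IDNA 2003 rules. The host name is a made-up example.

```python
host = "m\u00fcnchen.example"  # hypothetical internationalized host name

# Each label with non-ASCII characters is converted to an "xn--" Punycode label
ascii_host = host.encode("idna")
print(ascii_host)  # b'xn--mnchen-3ya.example'

# The periods separating labels are preserved, and the conversion is reversible
round_trip = ascii_host.decode("idna")
print(round_trip == host)  # True
```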
All byte values from 0 (%00) to 255 (%FF) can be URL-encoded, so binary data can be passed in a URL parameter. Binary data doesn’t need to have every byte encoded; it’s more efficient to pass hex 41 as “A” rather than “%41”.
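A short sketch of passing binary data, using the byte-oriented helpers in Python's standard library. The three-byte payload is an arbitrary example.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

payload = bytes([0x00, 0x41, 0xFF])  # arbitrary binary data; 0x41 is the letter "A"

encoded = quote_from_bytes(payload, safe="")
print(encoded)  # %00A%FF

# Decoding recovers the original bytes exactly
restored = unquote_to_bytes(encoded)
print(restored == payload)  # True
```

Only the bytes outside the safe set are expanded; the byte 0x41 stays as "A".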
The URL Encode tool should be used only with single URL parameters. Providing a full URL (or even a parameter string) will encode special characters, such as “?” and “=”, making them ordinary text rather than part of the standard URL syntax.
However, when decoding, you can safely provide a full URL. The tool assumes that only parameter values have been encoded, so it leaves the other parts of the URL unchanged.
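The difference between encoding a whole URL and encoding a single parameter value can be sketched as follows. This uses Python's quote and unquote as stand-ins, so the tool's exact output may differ.

```python
from urllib.parse import quote, unquote

full_url = "http://www.example.com?text=one two"

# Encoding the whole URL also encodes ":", "/", "?" and "=", destroying its structure
broken = quote(full_url, safe="")
print(broken)  # http%3A%2F%2Fwww.example.com%3Ftext%3Done%20two

# Encoding only the parameter value keeps the URL syntax intact
good = "http://www.example.com?text=" + quote("one two", safe="")
print(good)  # http://www.example.com?text=one%20two

# Decoding a full URL is harmless because only encoded sequences change
print(unquote(good))  # http://www.example.com?text=one two
```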
One drawback to URL encoding is that it adds to the URL’s length. There’s no official limit on how long a URL can be, but many consider 2000 characters to be the maximum safe length. Generally, you don’t know what parameter values will be sent, so you should allow for possible URL encoding when making estimates. If a URL with all of its parameters might exceed 2000 characters, consider making an HTTP POST instead of an HTTP GET request.
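One way to act on this advice is to measure the URL after encoding and fall back to POST when it is too long. The sketch below assumes the 2000-character rule of thumb from the text; choose_method and MAX_SAFE_URL_LENGTH are hypothetical names.

```python
from urllib.parse import quote, urlencode

MAX_SAFE_URL_LENGTH = 2000  # rule of thumb from the text, not an official limit

def choose_method(base_url: str, params: dict[str, str]) -> str:
    """Return "GET" if the encoded URL fits the assumed limit, else "POST"."""
    # Every encoded byte takes three characters, so measure after encoding
    query = urlencode(params, quote_via=quote)
    url = f"{base_url}?{query}"
    return "GET" if len(url) <= MAX_SAFE_URL_LENGTH else "POST"

print(choose_method("http://www.example.com", {"text": "one two three"}))  # GET
print(choose_method("http://www.example.com", {"text": " " * 1000}))       # POST
```

The second call shows the growth: 1000 spaces become 3000 characters once encoded.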
In HTTP POST requests, form fields are sent as data in the request body rather than as part of the URL. By default, they use the application/x-www-form-urlencoded media type. This data looks like the query portion of a URL, but the format itself places no limit on how long the data can be.
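As a sketch, a form-encoded POST body can be built with Python's standard library as shown below. The endpoint path is hypothetical, and the request is only constructed, not sent. Note that this media type conventionally writes a space as "+" rather than "%20".

```python
from urllib.parse import urlencode
from urllib.request import Request

# urlencode() produces application/x-www-form-urlencoded data by default
body = urlencode({"text": "one two three"}).encode("ascii")
print(body)  # b'text=one+two+three'

req = Request(
    "http://www.example.com/submit",  # hypothetical endpoint, for illustration only
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)
print(req.get_method())  # POST
```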
A Note of Interest
The first URL syntax specification came out in 1994 (RFC 1738), but it omitted a lot of uses that are commonly seen today. Over time, various workarounds have been used to support things like arbitrary URL parameters and international languages. URL encoding is one widely used but imperfect solution to such problems.