What is the difference between decodeURIComponent and decodeURI?
To explain the difference between these two let me explain the difference between encodeURI
and encodeURIComponent
.
The main difference is that:
- The
encodeURI
function is intended for use on the full URI. - The
encodeURIComponent
function is intended to be used on .. well .. URI components that is any part that lies between separators (; / ? : @ & = + $ , #).
So, in encodeURIComponent
these separators are encoded also because they are regarded as text and not special characters.
Now back to the difference between the decode functions, each function decodes strings generated by its corresponding encode counterpart taking care of the semantics of the special characters and their handling.
js> s = "http://www.example.com/string with + and ? and & and spaces";
http://www.example.com/string with + and ? and & and spaces
js> encodeURI(s)
http://www.example.com/string%20with%20+%20and%20?%20and%20&%20and%20spaces
js> encodeURIComponent(s)
http%3A%2F%2Fwww.example.com%2Fstring%20with%20%2B%20and%20%3F%20and%20%26%20and%20spaces
Looks like encodeURI
produces a "safe" URI by encoding spaces and some other (e.g. nonprintable) characters, whereas encodeURIComponent
additionally encodes the colon and slash and plus characters, and is meant to be used in query strings. The encoding of + and ? and & is of particular importance here, as these are special chars in query strings.
encodeURIComponent/decodeURIComponent() is almost always the pair you want to use, for concatenating together and splitting apart text strings in URI parts.
encodeURI in less common, and misleadingly named: it should really be called fixBrokenURI. It takes something that's nearly a URI, but has invalid characters such as spaces in it, and turns it into a real URI. It has a valid use in fixing up invalid URIs from user input, and it can also be used to turn an IRI (URI with bare Unicode characters in) into a plain URI (using %-escaped UTF-8 to encode the non-ASCII).
Where encodeURI should really be named fixBrokenURI(), decodeURI() could equally be called potentiallyBreakMyPreviouslyWorkingURI(). I can think of no valid use for it anywhere; avoid.
As I had the same question, but didn't find the answer here, I made some tests in order to figure out what the difference actually is. I did this, since I need the encoding for something, which is not URL/URI related.
encodeURIComponent("A")
returns "A", it does not encode "A" to "%41"decodeURIComponent("%41")
returns "A".encodeURI("A")
returns "A", it does not encode "A" to "%41"decodeURI("%41")
returns "A".
-That means both can decode alphanumeric characters, even though they did not encode them. However...
encodeURIComponent("&")
returns "%26".decodeURIComponent("%26")
returns "&".encodeURI("&")
returns "&".decodeURI("%26")
returns "%26".
Even though encodeURIComponent does not encode all characters, decodeURIComponent can decode any value between %00 and %7F.
Note: It appears that if you try to decode a value above %7F (unless it's a unicode value), then your script will fail with an "URI error".