Why doesn't encodeURIComponent encode single quotes/apostrophes?
encodeURIComponent escapes all characters except the following: alphabetic characters, decimal digits, - _ . ! ~ * ' ( )
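A quick way to check that list yourself is to run every printable ASCII punctuation character through encodeURIComponent and keep the ones that come back unchanged. This is only a sketch; the variable names are made up for illustration:

```javascript
// Collect printable ASCII characters (0x20..0x7E) that are not letters or digits.
const punctuation = [];
for (let code = 0x20; code <= 0x7e; code++) {
  const ch = String.fromCharCode(code);
  if (!/[A-Za-z0-9]/.test(ch)) punctuation.push(ch);
}

// Keep only the characters that encodeURIComponent returns unchanged.
const untouched = punctuation.filter((ch) => encodeURIComponent(ch) === ch);

console.log(untouched.join(" ")); // ! ' ( ) * - . _ ~
```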
If you wish to use an encoding compatible with RFC 3986 (which reserves !, ', (, ), and *), you can use:
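function rfc3986EncodeURIComponent (str) {
  // escape() is deprecated and leaves "*" unencoded, so build the %XX form by hand
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}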
You can get more information on this on MDN.
UPDATE:
To answer your question of why ' and the other characters mentioned above are not encoded by encodeURIComponent: the short answer is that they only need to be encoded in certain URI schemes, and the decision to encode them depends on the scheme you're using.
To quote RFC 3986:
URI producing applications should percent-encode data octets that correspond to characters in the reserved set unless these characters are specifically allowed by the URI scheme to represent data in that component. If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.
Where "reserved set" is defined as
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
The apostrophe is in the sub-delims group. In other words, you must leave these characters unencoded, especially if you are sure that consuming applications will know what to do with them: for example, if you mistakenly encode ? and &, they will no longer delimit query parts. Historically there were also proposals for path segment parameters delimited with ; and , (they did not see wide adoption), so these characters are still allowed as well. It is not that the apostrophe is "free to use" (i.e. unreserved) in URI data, but that it was assumed it would have some special meaning in the URI context, for example in the segment part:
segment = *pchar
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
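To illustrate the point above about ? and & losing their delimiting role once encoded, here is a small sketch using the built-in URL class. The example.com address and the parameter names are made up for illustration:

```javascript
// Hypothetical base URL, used only for this demonstration.
const base = "https://example.com/search";

// Encode each value on its own and keep "?", "&" and "=" as delimiters.
const good = base + "?q=" + encodeURIComponent("rock'n'roll") + "&page=2";

// Mistake: encoding the whole query string also encodes the delimiters.
const bad = base + encodeURIComponent("?q=rock'n'roll&page=2");

const goodParams = new URL(good).searchParams;
console.log(goodParams.get("q"));    // rock'n'roll
console.log(goodParams.get("page")); // 2

const badUrl = new URL(bad);
console.log(badUrl.search);                   // "" (no query part is recognised)
console.log(badUrl.searchParams.get("page")); // null
```

In the second URL the encoded ? and & are treated as ordinary data inside the path, so the consumer never sees a query string at all.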
Try this:
encodeURIComponent(str).replace(/'/g, "%27");
The /char/g syntax is a regular expression with the g (global) flag, which tells JavaScript to replace all occurrences in your string rather than only the first one.
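A minimal comparison of the same replacement with and without the g flag (the input string is just an example):

```javascript
const input = "it's Bob's";
const encoded = encodeURIComponent(input); // apostrophes survive, space becomes %20

// Without the g flag only the first apostrophe is replaced.
const firstOnly = encoded.replace(/'/, "%27");

// With the g flag every apostrophe is replaced.
const all = encoded.replace(/'/g, "%27");

console.log(encoded);   // it's%20Bob's
console.log(firstOnly); // it%27s%20Bob's
console.log(all);       // it%27s%20Bob%27s
```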