Server.UrlEncode vs. HttpUtility.UrlEncode
Fast-forward almost 9 years since this was first asked, and in the world of .NET Core and .NET Standard, it seems the most common options we have for URL-encoding are WebUtility.UrlEncode (under System.Net) and Uri.EscapeDataString. Judging by the most popular answer here and elsewhere, Uri.EscapeDataString appears to be preferable. But is it? I did some analysis to understand the differences and here's what I came up with:
- WebUtility.UrlEncode encodes space as +; Uri.EscapeDataString encodes it as %20.
- Uri.EscapeDataString percent-encodes !, (, ), and *; WebUtility.UrlEncode does not.
- WebUtility.UrlEncode percent-encodes ~; Uri.EscapeDataString does not.
- Uri.EscapeDataString throws a UriFormatException on strings longer than 65,520 characters; WebUtility.UrlEncode does not. (A more common problem than you might think, particularly when dealing with URL-encoded form data.)
- Uri.EscapeDataString throws a UriFormatException on the high surrogate characters; WebUtility.UrlEncode does not. (That's a UTF-16 thing, probably a lot less common.)
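The first three differences are easy to see side by side. The following is a minimal sketch, assuming .NET Core or .NET Framework 4.5 and later; the class name EncodingDifferences and the sample inputs are just for illustration:

```csharp
using System;
using System.Net;

class EncodingDifferences
{
    static void Main()
    {
        // Inputs chosen to hit the space, ! ( * ), and ~ cases from the list above.
        string[] samples = { "a b", "!(*)", "~" };

        Console.WriteLine("Input\tWebUtility.UrlEncode\tUri.EscapeDataString");
        foreach (string s in samples)
        {
            // "a b"  -> a+b    vs  a%20b
            // "!(*)" -> !(*)   vs  %21%28%2A%29
            // "~"    -> %7E    vs  ~
            Console.WriteLine($"{s}\t{WebUtility.UrlEncode(s)}\t{Uri.EscapeDataString(s)}");
        }
    }
}
```

The length and surrogate cases are left out on purpose, since whether they throw depends on the runtime version you are on.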
For URL-encoding purposes, characters fit into one of 3 categories: unreserved (legal in a URL); reserved (legal in a URL but with special meaning, so you might want to encode it); and everything else (must always be encoded).
According to the RFC, the reserved characters are: :/?#[]@!$&'()*+,;=
And the unreserved characters are alphanumeric and -._~
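To make the three categories concrete, here is a rough sketch of a classifier built directly from the two character lists above. UrlCharClass and Classify are hypothetical names, not part of any framework API, and "alphanumeric" is assumed to mean ASCII letters and digits only:

```csharp
using System;

static class UrlCharClass
{
    // The reserved and unreserved sets exactly as listed above.
    const string Reserved = ":/?#[]@!$&'()*+,;=";
    const string UnreservedMarks = "-._~";

    public static string Classify(char c)
    {
        // Assumption: only ASCII letters and digits count as alphanumeric here.
        bool isAsciiAlphanumeric =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');

        if (isAsciiAlphanumeric || UnreservedMarks.IndexOf(c) >= 0)
            return "unreserved";   // legal in a URL as-is
        if (Reserved.IndexOf(c) >= 0)
            return "reserved";     // legal, but carries special meaning
        return "other";            // must always be encoded
    }
}
```

With this, Classify('~') returns "unreserved", Classify('?') returns "reserved", and Classify(' ') returns "other".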
The Verdict
Uri.EscapeDataString clearly defines its mission: %-encode all reserved and illegal characters. WebUtility.UrlEncode is more ambiguous in both definition and implementation. Oddly, it encodes some reserved characters but not others (why parentheses and not brackets??), and stranger still it encodes that innocently unreserved ~ character.
Therefore, I concur with the popular advice - use Uri.EscapeDataString when possible, and understand that reserved characters like / and ? will get encoded. If you need to deal with potentially large strings, particularly with URL-encoded form content, you'll need to either fall back on WebUtility.UrlEncode and accept its quirks, or otherwise work around the problem.
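One possible workaround for the length limit is to escape the string in chunks and concatenate the results. This is only a sketch: LongStringEscaper, EscapeLong, and the ChunkSize value are hypothetical, picked to stay well under the 65,520-character limit mentioned above, and the approach assumes that escaping chunks separately gives the same result as escaping the whole string, which holds as long as no surrogate pair is split across a chunk boundary.

```csharp
using System;
using System.Text;

static class LongStringEscaper
{
    // Hypothetical chunk size, chosen to stay well below the 65,520-character limit.
    const int ChunkSize = 32766;

    public static string EscapeLong(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (value.Length <= ChunkSize) return Uri.EscapeDataString(value);

        var sb = new StringBuilder(value.Length + value.Length / 2);
        int i = 0;
        while (i < value.Length)
        {
            int len = Math.Min(ChunkSize, value.Length - i);

            // Don't end a chunk on a high surrogate; push it into the next chunk
            // so the pair is escaped together.
            if (i + len < value.Length && char.IsHighSurrogate(value[i + len - 1]))
                len--;

            sb.Append(Uri.EscapeDataString(value.Substring(i, len)));
            i += len;
        }
        return sb.ToString();
    }
}
```

For short inputs this simply defers to Uri.EscapeDataString, so LongStringEscaper.EscapeLong("a b") gives "a%20b".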
EDIT: I've attempted to rectify ALL of the quirks mentioned above in Flurl via the Url.Encode, Url.EncodeIllegalCharacters, and Url.Decode static methods. These are in the core package (which is tiny and doesn't include all the HTTP stuff), or feel free to rip them from the source. I welcome any comments/feedback you have on these.
Here's the code I used to discover which characters are encoded differently:
    using System;
    using System.Linq;
    using System.Net;

    // Run both encoders over every UTF-16 code unit and keep the ones that differ.
    var diffs =
        from i in Enumerable.Range(0, char.MaxValue + 1)
        let c = (char)i
        // Skip high surrogates, which make Uri.EscapeDataString throw.
        where !char.IsHighSurrogate(c)
        let diff = new {
            Original = c,
            UrlEncode = WebUtility.UrlEncode(c.ToString()),
            EscapeDataString = Uri.EscapeDataString(c.ToString()),
        }
        where diff.UrlEncode != diff.EscapeDataString
        select diff;

    foreach (var diff in diffs)
        Console.WriteLine($"{diff.Original}\t{diff.UrlEncode}\t{diff.EscapeDataString}");
HttpServerUtility.UrlEncode will use HttpUtility.UrlEncode internally. There is no specific difference. The reason for existence of Server.UrlEncode is compatibility with classic ASP.
I've had significant headaches with these methods before, so I recommend you avoid any variant of UrlEncode and instead use Uri.EscapeDataString - at least that one has comprehensible behavior.
Let's see...
    HttpUtility.UrlEncode(" ") == "+"        // breaks ASP.NET when used in paths,
                                             // non-standard, undocumented.
    Uri.EscapeUriString("a?b=e") == "a?b=e"  // makes sense, but rarely what you
                                             // want, since you still need to
                                             // escape special characters yourself
But my personal favorite has got to be HttpUtility.UrlPathEncode - this thing is really incomprehensible. It encodes:
- " " ==> "%20"
- "100% true" ==> "100%%20true" (ok, your url is broken now)
- "test A.aspx#anchor B" ==> "test%20A.aspx#anchor%20B"
- "test A.aspx?hmm#anchor B" ==> "test%20A.aspx?hmm#anchor B" (note the difference with the previous escape sequence!)
It also has the lovelily specific MSDN documentation "Encodes the path portion of a URL string for reliable HTTP transmission from the Web server to a client." - without actually explaining what it does. You are less likely to shoot yourself in the foot with an Uzi...
In short, stick to Uri.EscapeDataString.
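As an illustration of that advice, here is a small sketch that builds a query string by escaping each key and value on its own with Uri.EscapeDataString. QueryBuilder and BuildQuery are hypothetical names used only for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QueryBuilder
{
    // Escapes every key and value individually, then joins them with '&'.
    // The '=' and '&' separators are added after escaping, so any '=' or '&'
    // inside the data itself ends up percent-encoded.
    public static string BuildQuery(IEnumerable<KeyValuePair<string, string>> parameters)
    {
        return string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
    }
}
```

For example, a single pair with key "q" and value "a b&c=d" comes out as q=a%20b%26c%3Dd, so the embedded & and = can't be mistaken for separators.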