application/x-www-form-urlencoded and charset="utf-8"?

Note that step 2 of the above link says: "Otherwise, let the selected character encoding be UTF-8." (See http://www.w3.org/TR/html5/forms.html#application/x-www-form-urlencoded-encoding-algorithm.)

Doesn't the following also indicate that using UTF-8 is a best practice for user agents?

http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars

Here's what it says: B.2.1 Non-ASCII characters in URI attribute values

Although URIs do not contain non-ASCII values (see [URI], section 2.1) authors sometimes specify them in attribute values expecting URIs (i.e., defined with %URI; in the DTD). For instance, the following href value is illegal:

...

We recommend that user agents adopt the following convention for handling non-ASCII characters in such cases:

  1. Represent each character in UTF-8 (see [RFC2279]) as one or more bytes.
  2. Escape these bytes with the URI escaping mechanism (i.e., by converting each byte to %HH, where HH is the hexadecimal notation of the byte value).

This procedure results in a syntactically legal URI (as defined in [RFC1738], section 2.2 or [RFC2141], section 2) that is independent of the character encoding to which the HTML document carrying the URI may have been transcoded.
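To make the two steps concrete, here is a minimal Python sketch of that convention. The function name `escape_non_ascii_uri` and the example.com URI are made up for illustration, and the `safe` set is an assumption chosen so that reserved characters and existing %HH escapes pass through unchanged. It does not attempt to handle non-ASCII host names.

```python
from urllib.parse import quote

# Assumption: characters left untouched (URI reserved characters plus "%",
# so that escapes already present in the input are not double-encoded).
_SAFE = ":/?#[]@!$&'()*+,;=%"

def escape_non_ascii_uri(uri: str) -> str:
    """Encode each disallowed character as UTF-8 bytes, then as %HH per byte."""
    return quote(uri, safe=_SAFE, encoding="utf-8")

# "é" is the two UTF-8 bytes C3 A9, so it becomes "%C3%A9".
print(escape_non_ascii_uri("http://example.com/café"))
```

Because the escaping always goes through UTF-8, the result is the same regardless of the encoding the HTML document was served in.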

Note. Some older user agents trivially process URIs in HTML using the bytes of the character encoding in which the document was received. Some older HTML documents rely on this practice and break when transcoded. User agents that want to handle these older documents should, on receiving a URI containing characters outside the legal set, first use the conversion based on UTF-8. Only if the resulting URI does not resolve should they try constructing a URI based on the bytes of the character encoding in which the document was received.
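The fallback order in that note could be sketched roughly as follows. This only builds the list of URIs to try, in order; actually resolving them is left out. The name `candidate_uris`, the `safe` set, and the example URI are hypothetical and not taken from the spec.

```python
from urllib.parse import quote

# Assumption: reserved characters and existing %HH escapes are left as-is.
_SAFE = ":/?#[]@!$&'()*+,;=%"

def candidate_uris(uri: str, document_encoding: str) -> list[str]:
    """Return URIs to try in order: UTF-8 based first, document encoding second."""
    candidates = [quote(uri, safe=_SAFE, encoding="utf-8")]
    # Legacy behaviour: escape the bytes of the document's own encoding.
    legacy = quote(uri, safe=_SAFE, encoding=document_encoding, errors="replace")
    if legacy not in candidates:
        candidates.append(legacy)
    return candidates

print(candidate_uris("http://example.com/café", "iso-8859-1"))
```

A user agent following the note would request the first candidate and fall back to the second only if the first does not resolve.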

Note. The same conversion based on UTF-8 should be applied to values of the name attribute for the A element.


  1. There is no charset parameter defined for this media type.

  2. For the encoding guidelines, see https://url.spec.whatwg.org/#application/x-www-form-urlencoded .

The application/x-www-form-urlencoded standard implies UTF-8 and percent-encoding.
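As a quick illustration, Python's standard library behaves this way by default: `urllib.parse.urlencode` and `parse_qs` both assume UTF-8 unless told otherwise. The field names and values below are invented for the example.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields containing non-ASCII characters.
fields = {"name": "café", "city": "São Paulo"}

# Serialize: each character becomes UTF-8 bytes, each byte %HH, spaces become "+".
body = urlencode(fields)
print(body)

# Parse: percent-escapes are decoded as UTF-8 by default.
decoded = parse_qs(body)
print(decoded)
```

Note that no charset appears anywhere in the body or would need to appear in the Content-Type header; UTF-8 is simply assumed on both sides.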

However, the same standard also notes:

A legacy server-oriented implementation might have to support encodings other than UTF-8 as well as have special logic for tuples of which the name is _charset_. Such logic is not described here as only UTF-8 is conforming.
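Since the standard deliberately leaves that logic unspecified, the following is only one possible approach, not anything the spec prescribes. It assumes the client sent a `_charset_` field naming the encoding it used, reads that label in a first pass, and then re-parses the body with it. The function name `parse_legacy_form` and the sample bodies are hypothetical.

```python
from urllib.parse import parse_qsl

def parse_legacy_form(body: str, default_encoding: str = "utf-8") -> list[tuple[str, str]]:
    """Parse a urlencoded body, honouring a _charset_ field if one is present."""
    # First pass: latin-1 never fails to decode, and the encoding label is ASCII.
    first_pass = parse_qsl(body, keep_blank_values=True, encoding="latin-1")
    declared = next(
        (value for name, value in first_pass if name == "_charset_" and value),
        default_encoding,
    )
    try:
        return parse_qsl(body, keep_blank_values=True, encoding=declared)
    except LookupError:
        # Unknown encoding label: fall back to the default (UTF-8).
        return parse_qsl(body, keep_blank_values=True, encoding=default_encoding)

# Body produced by a legacy form submitted as windows-1252 ("é" is byte E9).
print(parse_legacy_form("name=caf%E9&_charset_=windows-1252"))
```

Without a `_charset_` field the function behaves like a plain UTF-8 parse, which is the only conforming case.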