HTML5 syntax - HTML vs XHTML

The HTML5 draft is very clear about which syntax to use:

  • use HTML syntax when sending pages as text/html
  • use XHTML syntax when sending pages as application/xhtml+xml

Reference: http://dev.w3.org/html5/spec/Overview.html#authors-using-xhtml


I guess my true question is: is there a reason to switch from XHTML to HTML syntax? I've been using XHTML for years and am not sure whether there is a reason to switch back. Browser compatibility (IE was sometimes finicky with the application/xhtml+xml MIME type), etc.?

As mentioned in a previous answer, text/html gets parsed as HTML and application/xhtml+xml gets parsed as XML. Thus, you should use the syntax that matches the MIME type you use.
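
To make the point concrete, here is a rough sketch of a client that picks its parser from the MIME type alone. The parse_by_mime function and the TagCollector class are hypothetical names made up for this illustration, and Python's html.parser is only a lenient tokenizer, not an implementation of the HTML5 parsing algorithm, so treat this as an approximation of what a browser does rather than a model of it.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Records the start tags seen by Python's lenient HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_by_mime(body, mime_type):
    """Hypothetical dispatcher: the MIME type, not the DOCTYPE, picks the parser."""
    if mime_type == "text/html":
        collector = TagCollector()
        collector.feed(body)
        collector.close()
        return collector.tags
    if mime_type == "application/xhtml+xml":
        # Raises xml.etree.ElementTree.ParseError if the body is not well-formed XML.
        root = ET.fromstring(body)
        return [element.tag for element in root.iter()]
    raise ValueError(f"unsupported MIME type: {mime_type}")


# HTML-style markup: the br element has no closing slash.
html_style = "<p>Line one<br>line two</p>"

print(parse_by_mime(html_style, "text/html"))
try:
    parse_by_mime(html_style, "application/xhtml+xml")
except ET.ParseError as err:
    print("XML parser rejected it:", err)
```

The same bytes are accepted on the text/html path and rejected on the XML path, which is why the syntax has to match the MIME type.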

If you are now serving text/html but using XHTML syntax, then you should fix your content to use the HTML5 syntax. You may already be close, since HTML5 allows the XMLesque /> empty element syntax for void elements (elements that are always empty, such as img and br).

If you are now using application/xhtml+xml and you care about supporting IE, that alone is a reason to switch to text/html and the HTML syntax.

Trying to write polyglot documents that are correct HTML5 and XHTML5 (for serving different MIME types to different browsers with the same payload bytes) is harder than it seems at first sight and not worth the trouble.


When using XHTML, you can mix it with other XML content, e.g. MathML, SVG or your own proprietary format, just by changing the namespace at some point. You can also embed XHTML inside other XML documents.

(Well, actually, MathML and SVG can be used in non-XML HTML5 too, but they are special-cased.)
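
As a small illustration of the namespace switch, the sketch below parses an XHTML document with inline SVG using Python's xml.etree.ElementTree and sorts the elements by namespace. The sample document and the namespace_of helper are invented for this example; only the two namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Made-up sample: an XHTML page that switches to the SVG namespace midway.
document = f"""<html xmlns="{XHTML_NS}">
  <head><title>Mixed namespaces</title></head>
  <body>
    <p>A circle:</p>
    <svg xmlns="{SVG_NS}" width="20" height="20">
      <circle cx="10" cy="10" r="8"/>
    </svg>
  </body>
</html>"""

root = ET.fromstring(document)


def namespace_of(element):
    """Return the namespace URI from ElementTree's '{uri}local' tag form."""
    if element.tag.startswith("{"):
        return element.tag[1:].split("}", 1)[0]
    return ""


# Local names of every element that lives in the SVG namespace.
svg_elements = [
    element.tag.split("}", 1)[1]
    for element in root.iter()
    if namespace_of(element) == SVG_NS
]
print(svg_elements)
```

A generic XML tool needs no knowledge of XHTML or SVG to do this; the xmlns declarations carry all the information.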


The advantage of XHTML syntax is that it is XML. It can be easily parsed, understood and manipulated. The HTML syntax is a lot harder for clients to work with.

Nonsense! The HTML5 spec defines how to parse HTML in a way that is relatively easy to implement, and off-the-shelf parsers are being developed that can be easily integrated into tool chains. It's even possible for an HTML5 parser to be integrated into an XML tool chain in place of an XML parser.

But what you need to understand is that in practice, you're most likely using HTML anyway, even if you think you're using XHTML based on the DOCTYPE. If your content is being served as text/html, instead of application/xhtml+xml or another XML MIME type, then your content will be processed as HTML.

With HTML5, you can choose to use HTML-only syntax, meaning that it is only compatible with being served and processed as text/html; it is not well-formed XML. Or you can use XHTML-only syntax, meaning that it is well-formed XML but uses XML features that are not compatible with HTML. Or you can write a polyglot document, which is conforming and compatible with both HTML and XHTML processing. (In principle, this is similar to writing XHTML 1.0 that conforms to the Appendix C guidelines.)
