Why should XSS filters escape forward slash?
Before HTML5 introduced a standard parsing algorithm, HTML 4 was defined as an application of SGML. That SGML-based syntax included features that browsers supported unevenly and that many authors were never aware of.
One of the more obscure features, and the one relevant to this question, is the Null End Tag (NET). Have a look at the following code for an HTML page:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title/hello/
<body>
<p/hello/
If you want, try putting it through an HTML validator.
NET specifies that code like <TAGNAME/text/ is parsed into the same thing you'd get from <TAGNAME>text</TAGNAME>.
From here, you may be able to see why it's a good idea to escape the / character when interpolating user-provided data into markup. A / could potentially end up in one of these NET constructions and be interpreted as the end tag, instead of a literal solidus.
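To make that concrete, here is a rough Python sketch. The expand_net helper is a toy I made up to mimic NET expansion with a regular expression (a real SGML parser is far more involved), and escape_solidus is likewise a hypothetical name. It only illustrates how an unescaped / in user data would close the element early, assuming the data lands inside a NET construction.

```python
import re

# Toy model of SGML Null End Tag (NET) expansion: <name/text/ becomes
# <name>text</name>. Hypothetical helper for illustration only.
NET_PATTERN = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/([^/<]*)/")


def expand_net(markup: str) -> str:
    """Rewrite NET constructions into ordinary start/end tag pairs."""
    return NET_PATTERN.sub(r"<\1>\2</\1>", markup)


def escape_solidus(text: str) -> str:
    """Replace each literal solidus with a numeric character reference."""
    return text.replace("/", "&#x2F;")


user_input = "a/b"

# Unescaped: the / inside the user data is taken as the null end tag.
unsafe = expand_net("<p/" + user_input + "/")

# Escaped: the user data no longer contains a literal /.
safe = expand_net("<p/" + escape_solidus(user_input) + "/")

print(unsafe)  # <p>a</p>b/
print(safe)    # <p>a&#x2F;b</p>
```

In the unescaped case the element ends after "a" and the rest of the input spills out of it, which is exactly the kind of surprise the escaping is meant to rule out.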
Sure, no browser today understands NET syntax, but it is part of the spec, so it's prudent to account for it.
Your question asks "Why /?" However, you don't seem to be concerned about >, " and '?
Technically, to ensure security within element content you only need to encode the < and & characters, because HTML tags cannot start with >, ", ' or /. Note that I am only talking about HTML here (not XHTML).
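As a rough illustration of that minimal rule, a Python sketch might look like the following. The function name is my own, and it assumes the output is placed only in HTML element content, never inside an attribute value, a script block, or a URL.

```python
def escape_element_content_minimal(text: str) -> str:
    """Encode only & and <, the two characters that matter in element content."""
    # Escape & first so the ampersand in "&lt;" is not escaped a second time.
    return text.replace("&", "&amp;").replace("<", "&lt;")


print(escape_element_content_minimal("<b>Tom & Jerry</b>"))
# &lt;b>Tom &amp; Jerry&lt;/b>
```

The >, quotes and / pass through untouched, which is fine for this one context and nowhere else.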
To answer your question, the reason is that / is an HTML character with special meaning. If you are following the OWASP guidelines, it is assumed that you want your system to be secure, and it is always best to err on the side of caution since there is little downside to doing so in this case.
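For comparison, a cautious encoder along those lines could be sketched like this in Python. The six-character table is how I remember the element-content rule in older revisions of the OWASP XSS Prevention Cheat Sheet, so treat the exact entity choices as an assumption and check them against the current guidance. The names are hypothetical.

```python
# Characters and replacements as recalled from the OWASP cheat sheet rule for
# HTML element content; treat the exact table as an assumption.
OWASP_ENTITIES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2F;",
}


def escape_html_owasp(text: str) -> str:
    """Encode each significant character, including the forward slash."""
    # Each input character is mapped independently, so nothing is escaped twice.
    return "".join(OWASP_ENTITIES.get(ch, ch) for ch in text)


print(escape_html_owasp("</p>"))  # &lt;&#x2F;p&gt;
```

The extra cost over a minimal encoder is a few more table entries.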
If you want to be minimalistic check out the OWASP XSS Experimental Minimal Encoding Rules.
When a character has syntactical meaning in any context, you should be wary, and err on the side of escaping it.
If, for instance, an attacker has found a way to inject it at the end of some existing tag, they can create a tag like <br />, which does not require an end tag; that transformation could change the meaning of the document.
I apologize that I don't have a concrete example for you beyond that, but it costs very little to escape it, so the tradeoff is certainly in favour of doing so.
I see no particular reason for it other than as a safeguard against some other bug or mistake becoming exploitable because a / was allowed through. It isn't an especially strong recommendation, in that no universal PoC can be developed to show its necessity, but there is no harm whatsoever in following it.