Current best practices to prevent persistent XSS attacks
Since you want current best practices and the latest answer here is August 2012, I thought I might as well weigh in and update this.
Best practices to prevent any type of XSS attack (persistent, reflected, DOM, whatever):
1. Strictly validate all input. For example, if you're asking for a UK postcode, ensure that only letters, numbers and the space character are allowed. Do this server-side, and if validation fails, display a message to the user so that they can correct their input. Do this for all variables outside of your control, including the query string, POST data, headers and cookies.
2. Add yourself some security headers. Namely:
   - `X-XSS-Protection: 1; mode=block` to activate reflective XSS browser protection in blocking mode instead of filtering mode. Blocking mode stops certain attacks that filtering mode lets through. Edit 2021-01-28: This header is now deprecated due to browsers like Chrome discontinuing their inclusion of XSS auditors.
   - `X-Content-Type-Options: nosniff` to prevent JavaScript being inserted into images and other content types.
   - `Content-Security-Policy:` with strict `script-src` and `style-src` directives at least. Do not allow `unsafe-inline` or `unsafe-eval`. This is the daddy of headers for killing off XSS.
3. Follow the rules in the OWASP XSS (Cross Site Scripting) Prevention Cheat Sheet when outputting values. For rule #3, however, I'd do the following instead:
   - Use HTML data attributes to output anything dynamic on the page, e.g. `<body data-foo="@foo" />`, where `@foo` outputs an HTML-encoded version of the variable. Any quotes or angle brackets in the value are emitted as entities such as `&quot;`, `&lt;` and `&gt;`, so the value cannot break out of the attribute.
   - Grab these values out using jQuery or JavaScript: `var foo = $("body").data("foo");`
   - This way you don't need to worry about any double encoding, unless your JavaScript later inserts the value as HTML. Even then things are simpler, because you deal with the encoding at that point instead of mixing it all together.
   - Use a function like the one below to HTML encode if you're using `document.write`, otherwise you could introduce a vulnerability. Ideally, though, use `textContent` or jQuery's `text()` and `attr()` functions.
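As a rough sketch of the security headers item, the headers could be attached to every response in a Node.js server along these lines. The helper name `applySecurityHeaders` and the exact policy string are my own assumptions for illustration; `X-XSS-Protection` is left out because, as noted above, it is now deprecated.

```javascript
const http = require("http");

// Illustrative header set; tune the policy to the origins your site really uses.
const SECURITY_HEADERS = {
  // Stop browsers from MIME-sniffing a response into an executable type.
  "X-Content-Type-Options": "nosniff",
  // Same-origin scripts and styles only; no 'unsafe-inline' or 'unsafe-eval'.
  "Content-Security-Policy":
    "default-src 'self'; script-src 'self'; style-src 'self'",
};

// Hypothetical helper: copies every header above onto a response object.
function applySecurityHeaders(res) {
  for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
    res.setHeader(name, value);
  }
}

// The server is only created here; call server.listen(port) to start it.
const server = http.createServer((req, res) => {
  applySecurityHeaders(res);
  res.setHeader("Content-Type", "text/html; charset=utf-8");
  res.end("<!doctype html><title>ok</title><p>Hello</p>");
});
```

With a policy like this, inline `<script>` blocks and inline event handlers are refused by the browser, so any page that relies on them would need its scripts moved into separate files first.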
Tackle these in reverse order. Concentrate on #3, as this is the primary mitigation for XSS; #2 tells the browser not to execute anything that slips through; and #1 is a good defence-in-depth measure (if special characters can't get in, they can't get out). However, #1 is weaker because not all fields can be strictly validated and it can impair functionality (imagine Stack Exchange without the ability to allow "<script>" as an input).
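For a field that can be strictly validated, such as the UK postcode mentioned in #1, a server-side allow-list check might look like the sketch below. The function name `validatePostcode` and the 5 to 8 character length bound are my assumptions; this only restricts the character set and does not check that the postcode really exists.

```javascript
// Allow-list: letters, digits and spaces only. The length bound is an assumption.
const POSTCODE_CHARS = /^[A-Za-z0-9 ]{5,8}$/;

// Returns { ok: true, value } or { ok: false, error } so the caller can
// show the message to the user and let them correct the input.
function validatePostcode(raw) {
  if (typeof raw !== "string") {
    return { ok: false, error: "Postcode is required." };
  }
  const value = raw.trim().toUpperCase();
  if (!POSTCODE_CHARS.test(value)) {
    return {
      ok: false,
      error: "Postcode may contain only letters, numbers and spaces.",
    };
  }
  return { ok: true, value };
}

console.log(validatePostcode("sw1a 1aa").ok);                  // true
console.log(validatePostcode("<script>alert(1)</script>").ok); // false
```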
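function escapeHtml(str) {
  // Replace "&" first so the entities produced below are not encoded twice.
  return String(str)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;")
    .replace(/\//g, "&#x2F;");
}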
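To illustrate the data-attribute approach from #3, here is a minimal sketch of what the server-side output step amounts to. `encodeForAttribute` and `renderBodyTag` are hypothetical names standing in for whatever your template engine's `@foo` syntax does; the encoder follows the same idea as `escapeHtml`.

```javascript
// Minimal encoder for a double-quoted HTML attribute value.
function encodeForAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}

// Hypothetical render step: emits the tag with the dynamic value encoded.
function renderBodyTag(foo) {
  return `<body data-foo="${encodeForAttribute(foo)}">`;
}

const hostile = `"><script>alert(1)</script>`;
console.log(renderBodyTag(hostile));
// <body data-foo="&quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;">
```

In the browser, reading the attribute back (for example with `$("body").data("foo")`) yields the original, decoded string, which is why it should then go into the page via `textContent` or `text()` rather than as HTML.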
Escape quotes, filter out the word javascript, restrict the allowed letters, etc.
You might want to just read through the OWASP guidelines on XSS.
This additional page at OWASP should be useful too; it deals with the encoding issues in our discussions below.
Compose your HTML using an auto-escaping template language that, by default, escapes untrusted inputs for you.
For example, Django templates escape HTML special characters:
Clearly, user-submitted data shouldn't be trusted blindly and inserted directly into your Web pages, because a malicious user could use this kind of hole to do potentially bad things. This type of security exploit is called a Cross Site Scripting (XSS) attack.
To avoid this problem, you have two options:
- You can make sure to run each untrusted variable through the escape filter, which converts potentially harmful HTML characters to unharmful ones. This was the default solution in Django for its first few years, but the problem is that it puts the onus on you, the developer / template author, to ensure you're escaping everything. It's easy to forget to escape data.
- You can take advantage of Django's automatic HTML escaping. The remainder of this section describes how auto-escaping works. By default in Django, every template automatically escapes the output of every variable tag.
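The same escape-by-default idea can be sketched in JavaScript with a tagged template. This is not Django's implementation; the names `html` and `SafeHtml` are my own, with `SafeHtml` playing roughly the role that marking a value as safe plays in Django.

```javascript
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}

// Wrapper marking markup that has already been escaped or is trusted.
class SafeHtml {
  constructor(markup) {
    this.markup = markup;
  }
  toString() {
    return this.markup;
  }
}

// Tagged template: literal parts pass through untouched, and every
// interpolated value is escaped unless it is already a SafeHtml instance.
function html(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    out += value instanceof SafeHtml ? value.markup : escapeHtml(value);
    out += strings[i + 1];
  });
  return new SafeHtml(out);
}

const comment = "<script>alert(1)</script>";
console.log(String(html`<p>${comment}</p>`));
// <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The point of the design is that forgetting to escape is no longer possible by accident; opting out requires explicitly constructing a `SafeHtml`. Note that this sketch only covers HTML text and quoted attributes, not values embedded in JavaScript, CSS, or URLs.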
There are also a variety of contextually auto-escaping template languages, which are a bit smarter and can prevent XSS even when your templates contain embedded JavaScript, CSS, and URLs.
The Closure Templates documentation says:
Contextual autoescaping works by augmenting Closure Templates to properly encode each dynamic value based on the context in which it appears, thus defending against XSS vulnerabilities in values that are controlled by an attacker.
Go's template language uses contextual autoescaping:
HTML templates treat data values as plain text which should be encoded so they can be safely embedded in an HTML document. The escaping is contextual, so actions can appear within JavaScript, CSS, and URI contexts.
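To show why the context matters, here is a hand-rolled sketch in JavaScript with one encoder per output context. Real contextual autoescapers such as the ones quoted above work out the context by parsing the template; in this sketch the context is picked manually, and the `encoders` object and its keys are purely illustrative.

```javascript
const encoders = {
  // HTML text and quoted attribute values: numeric character references.
  html: (v) => String(v).replace(/[&<>"']/g, (c) => `&#${c.charCodeAt(0)};`),
  // A single URL query component.
  urlComponent: (v) => encodeURIComponent(String(v)),
  // A string embedded in an inline script block: emit a JSON literal and
  // escape "<" so that "</script>" cannot end the block early.
  jsValue: (v) => JSON.stringify(String(v)).replace(/</g, "\\u003c"),
};

const name = `"><img src=x onerror=alert(1)>`;

// The same untrusted value needs a different encoding in each position.
const out =
  `<a href="/search?q=${encoders.urlComponent(name)}">${encoders.html(name)}</a>` +
  `<script>var name = ${encoders.jsValue(name)};</script>`;

console.log(out);
```

Using the HTML encoder in the script block, or the URL encoder in the link text, would either break the value or leave a hole, which is the mistake contextual autoescaping is meant to rule out.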