How can I protect my website against bitsquatting?
Is there any standard defense technique against it?
As outlined in the other answers, bit errors when querying domain names may not be a realistic threat to your web application. But assuming they are, then Subresource Integrity (SRI) helps.
With SRI you specify a hash of the resource you're loading in an integrity attribute, like so:

    <script src="http://www.example.org/script.js"
            integrity="sha256-DEC+zvj7g7TQNHduXs2G7b0IyOcJCTTBhRRzjoGi4Y4="
            crossorigin="anonymous">
    </script>
From then on, it doesn't matter whether the script is fetched from a different domain due to a bit error (or modified by a MITM): your browser will refuse to execute the script if the hash of its content doesn't match the integrity value. So if a bit error, or anything else, makes the URL resolve to an attacker-controlled dxample.org instead, the only script the attacker could successfully inject would be one matching the hash, that is, the script you intended to load anyway.
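To get a feel for where names like dxample.org come from, here is a minimal sketch (the function name and the restriction to valid hostname characters are my own choices) that enumerates the domains reachable from a given domain by a single bit flip:

```python
import string

# Characters that are plausible in a hostname
VALID = set(string.ascii_lowercase + string.digits + "-.")

def single_bit_flips(domain):
    """Return all valid-looking domains one bit flip away from `domain`."""
    candidates = set()
    for i, ch in enumerate(domain):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped in VALID and flipped != ch:
                candidates.add(domain[:i] + flipped + domain[i + 1:])
    return candidates

# 'e' is 0x65 and 'd' is 0x64, so they differ by exactly one bit:
print("dxample.org" in single_bit_flips("example.org"))  # True
```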
The main use case for SRI is fetching scripts and stylesheets from potentially untrusted CDNs, but it works in any scenario where you want to ensure that the requested resource is unmodified.
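Computing an integrity value yourself is straightforward: it is the base64-encoded SHA-256 (or SHA-384/SHA-512) digest of the resource's bytes, prefixed with the algorithm name. A sketch in Python (the helper name is mine):

```python
import base64
import hashlib

def sri_hash(content: bytes, algo: str = "sha256") -> str:
    """Build an SRI integrity value: '<algo>-<base64 digest>'."""
    digest = hashlib.new(algo, content).digest()
    return f"{algo}-{base64.b64encode(digest).decode()}"

# The well-known SHA-256 digest of empty input, in SRI form:
print(sri_hash(b""))  # sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
```

Run this over the exact bytes you serve; any later change to the file, intentional or not, will invalidate the attribute.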
Note that SRI is limited to script and link for now, but support for other tags may come later:

    Note: A future revision of this specification is likely to include integrity support for all possible subresources, i.e., a, audio, embed, iframe, img, link, object, script, source, track, and video elements.

(From the specification)

(Also see this bug ticket)
Your concern is very likely unfounded.
First of all, you need to realize just how unlikely these memory malfunctions are. The person who wrote the above article logged requests to 32 clones of some of the most visited domains on the Internet over the course of 7 months. Those 52,317 hits had to be among hundreds of billions of requests. Unless you operate a website on the scale of Facebook, an attacker would have to be extremely lucky to even get just one unlucky victim on their bitsquatting domain.
Then you have to note that faulty memory rarely produces just a single error. The author writes:
These requests [...] show signs of several bit errors.
If the system of the victim is so broken that it can't even send an HTTP request without several bit errors, then any malware they download from the squat domain will likely not execute without errors either. It's a miracle it even managed to boot up in that condition.
And regarding those cases where bit errors were found in "web application caches, DNS resolvers, and a proxy server", which affect multiple users (some of them perhaps unlucky enough to receive the malware in an executable state): in these situations, the HTTP response comes from a different server than the one the client requested. So if you use HTTPS-only (which I assume you do, or you would have far more serious attacks to worry about), the certificate won't check out and the browser will refuse to download that resource.
And besides, HTTPS also makes it much less likely to get a successful connection at all when there is a system with broken RAM on the route. A single bit flip in a TLS record will cause the integrity check (the MAC or AEAD tag) to fail, so the receiver will reject it.
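The effect is easy to demonstrate with a plain HMAC. This is a simplified stand-in, not the actual TLS record protection, but the principle is the same: one flipped bit and the tag no longer verifies.

```python
import hashlib
import hmac

key = b"shared secret"
message = bytearray(b"GET /script.js HTTP/1.1")
tag = hmac.new(key, bytes(message), hashlib.sha256).digest()

# Simulate faulty RAM: flip a single bit of the message in transit
message[0] ^= 0x01

# Verification on the receiving side now fails
ok = hmac.compare_digest(tag, hmac.new(key, bytes(message), hashlib.sha256).digest())
print(ok)  # False
```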
tl;dr: stop worrying and set up HTTPS-only.
I doubt the article's reliability. Even though it was discussed at DEFCON and published as a whitepaper, I have serious concerns about the experimental results.
As @mootmoot pointed out in a comment, the author failed to distinguish deterministic programming errors from random bit flips.
The statement that concerns me is:
During the logging period [Sept. 2010 to May 2011, ed.] there were a total of 52,317 bitsquat requests from 12,949 unique IP addresses
This only proves that his squat domains were contacted; the author fails to provide crucial additional information:
- What percentage of the original CDN's traffic those requests represent (verifiable in theory, but I don't have those figures)
- The distribution of referring domains among the requests
The second is very important because it helps isolate deterministic programming failures from random bit flips.
Consider the following example: if you find an entry for gbcdn.net (a one-bit flip of fbcdn.net, Facebook's CDN) with referer https://facebook.com, you have likely found a genuine bitsquat.
If, on the contrary, you find multiple entries from a little-known website which, on inspection, turns out to have a broken Like button, then the problem is probably a programmer not copying/pasting the embed code correctly, or perhaps a bit flip that occurred in the programmer's IDE. Who knows...
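That distinction could be made mechanically by grouping squat-domain hits by referer. A sketch with made-up log entries (the data, domain names, and the one-hit heuristic are all hypothetical):

```python
from collections import Counter

# Hypothetical access-log entries for a squat domain: (requested_host, referer)
log = [
    ("gbcdn.net", "https://facebook.com/"),
    ("gbcdn.net", "https://tiny-blog.example/"),
    ("gbcdn.net", "https://tiny-blog.example/"),
    ("gbcdn.net", "https://tiny-blog.example/"),
]

referers = Counter(referer for _, referer in log)

for referer, hits in referers.items():
    if hits == 1:
        # A one-off hit referred by the legitimate site looks like a real bit flip
        print(f"{referer}: possible bitsquat ({hits} hit)")
    else:
        # Repeated hits from one site suggest a hard-coded typo, not RAM errors
        print(f"{referer}: likely deterministic error ({hits} hits)")
```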