Prevent denial of service attacks against slow hashing functions?
One possibility is to consider using client puzzles. The basic idea is to force the client to do a significant amount of work, and prove it has done so, before you will accept a username/password pair and try to validate it.
Here is an overview of the approach. The server generates a random puzzle, and sends the puzzle to the client. The server generates the puzzle in a way that it can predict reasonably well how much effort will be required to solve the puzzle (e.g., 100ms of computation). The client solves the puzzle, and then sends the solution along with the user's username and password.
In a web setting, this would probably be implemented with Javascript: the login page would contain the Javascript code to solve the puzzle and the puzzle description. A legitimate user's web browser would run the Javascript on the page, which would solve the puzzle and include the solution in the form along with the username and password.
This makes it harder for an attacker to DOS the server. For instance, if it takes the server 100ms of computation to check the validity of the user's password, and if the server chooses a puzzle that takes 100ms to solve, then an attacker will have to do at least as much work as the server. If you turn up the dial to select puzzles that will take 200ms to solve, then now the attacker has to do twice as much work as you, to DOS you. Thus, this does not prevent DOS, but it makes it harder and requires the attacker to invest more resources -- which may reduce the risk somewhat.
Here's a very simple example of how to construct a puzzle. The description of the puzzle is a random 128-bit number R (chosen by the server, and sent to the client). The goal is to find a number n such that SHA256(R, n) is zero in its low 20 bits. The client can loop over values of n (test whether n=0 is a solution, test whether n=1 is a solution, test whether n=2 is a solution, ...) until the client finds a solution. On average, you should expect this to take about 220 trials. If the client can perform about ten million hashes per second, then 220 trials should take about one-tenth of a second (100ms). Once the client finds a solution, the client can send n to the server. Note that the server can quickly verify that n is a valid solution to the puzzle (it takes only one hash, not a million hashes). This creates the required assymetry, where the server can very quickly generate new puzzles and verify claimed puzzle solutions, but where it takes longer to solve the puzzle, and where the server can control how long puzzle-solving takes with a fair degree of accuracy. This is just one example of a puzzle construction; there has been a great deal of research on this problem, and there are better constructions.
You might be especially interested in kaPoW, which implements this idea for the web context.
The main limitation is that this is not going to be secure against dedicated attack. A serious attacker can just acquire a botnet of a few hundred machines (this is not that expensive) and DOS you, and now you're completely hosed. Client puzzles can't stop that kind of attack: they can only stop a relatively low-grade attack. However, stopping DOS is in general very difficult.
A secondary limitation has to do with the broad variability of computational power of clients that might access your website. A puzzle that a high-end desktop can solve in 100ms might take a mobile device 1000ms to solve (I'm making up numbers here). If you choose parameters that make the puzzle relatively easy to solve on the lowest-power device that is ever likely to access your website, then you may find that security against high-end desktops is degraded. Also, it seems plausible that a C program (run by an attacker) might be able to solve the puzzle much faster than the Javascript (run by legitimate users), which would further exacerbate the gap between how fast an attacker can solve puzzles vs. how fast the slowest legitimate client can solve them. This gap degrades security.
A tertiary limitation is that client puzzles require the client to use Javascript, to log in. However this might be acceptable.
A variant of this idea is for the server to monitor itself and check whether it is under heavy load. Under normal conditions, skip the puzzle. However, if the server seems to be under DOS attack, then start requiring login attempts to come with a puzzle solution. This mitigates some of the disadvantages of the client puzzle approach, but still is far from perfect, and still cannot withstand a serious DOS attack.
I describe this potential solution, in case you interested, but I don't really recommend it for most sites. For most sites, the risk of DOS is relatively low, and if it does occur, you can deal with it at the time. For that reason, my guess is that it probably isn't worth your time or energy to implement these kinds of defenses. If you've got that much extra time and energy to spare, you might be better off spending it building new features or making your site better.
Reducing the number of guesses per minute is the way to go, but implementing it as a sleep
in your code will still consume precious resources and lead to a denial-of-service.
Instead, you want the client to wait, not the server. So log the time the attempt was made, add X
seconds to that time, and don't attempt further authentications from that user/IP until after that time has expired. (Edit, added:) That is, if an authentication attempt is received, reject it with an error message without even checking the password.
You'll want to put some corresponding javascript in the login page so that the user won't submit a new login attempt until that timeout has expired (plus a safety margin). That keeps users from seeing unnecessary error messages and getting confused.
Increasing the timeout with each attempt is also a good idea. This helps to keep from inconveniencing "normal" users, while making a brute-force attack practically impossible.
Another approach you could consider is to use CAPTCHAs to make this kind of DOS attack harder. The basic idea is to require the user to solve a CAPTCHA when they want to log in, and require them to present a correct solution to the CAPTCHA before you will validate their password.
This is unfriendly to users, so if you want to consider this, I would recommend a variant. Have your server monitor its own load, so it can detect when it is under attack. (For instance, measure how much of your CPU time you are spending on bcrypt-hashing passwords; or just measure the load on the server.) Under normal conditions, do nothing special: users just need to present a username and password, as usual, with no CAPTCHA anywhere.
When you detect you might be under attack, enter self-protection mode. In this mode, the server should send a CAPTCHA in the login page, asking users to enter in their username, enter their password, and enter the solution to the CAPTCHA. When the user clicks submit, the server should first validate the solution to the CAPTCHA before using bcrypt to verify the user's password. If the user didn't solve the CAPTCHA properly, don't even try to verify their password.
This raises the bar against DOS attacks, because now an attacker cannot simply send 10 invalid username/password pairs per second: the attacker will have to also solve 10 CAPTCHAs per second to DOS the server. As a result, the trivial DOS attack no longer succeeds. Also, this provides DOS resistance, while ensuring that in most situations users never need to see a CAPTCHA: users only need to solve CAPTCHAs when the server is under attack.
Of course, you can combine this scheme with rate-limiting by IP address and by username, if you wish.
That said, the security of this scheme is far from perfect. There are underground markets used by attackers that exist to automate the process of solving CAPTCHAs for you, at a fee. These services provide an API that their customers can use to submit a CAPTCHA and get back the solution. The going rate is something like a few dollars per thousand CAPTCHAs solved. Based upon this, we can compute what it would cost to mount a successful DOS attack against this scheme, by using these CAPTCHA-solving services. An attacker would need to solve 864,000 CAPTCHAs per day (10 per second) to take down your server, which would cost a few thousand dollars per day of downtime using existing markets. This is an upper bound on the cost of mounting a successful DOS attack. However, it almost certainly greatly over-estimates the cost of DOSing your server, as there will almost certainly be other, far more economical ways of mounting a DOS attack on your server.
Anyway, I share this potential solution based upon use of CAPTCHAs, but I don't really recommend it for most sites. For most sites, the risk of DOS is relatively low, and if it does occur, you can deal with it at the time. For that reason, my guess is that it probably isn't worth your time or energy to implement these kinds of defenses. If you've got that much extra time and energy to spare, you might be better off spending it building new features or making your site better.
Currently, DDOS is the simplest and cheapest way to take down your server. I have seen estimates for the cost of DDOS attack in the range of $100 per day, though it may cost significantly more to defeat a larger, better-resourced target.
Protecting your server against DOS attack is not easy; the economics of it are not working in your favor. Perhaps one of the best things you can do is arrange for your site to have a lot of excess capacity, and to make sure it can scale up as traffic grows (perhaps using a cloud service with VMs and network capacity on demand). The advantage of this approach is that it is beneficial even if you never come under DOS attack, because it will help you avoid getting overwhelmed by a flood of legitimate users -- e.g., if you get Slashdotted.