Is salting a hash really as secure as common knowledge implies?
You have a fundamental misconception of how rainbow tables work.
A rainbow table or a hash table is built by an attacker prior to an attack. Say I build a hash table containing all the hashes of strings below 7 characters for MD5
. If I compromise your database and obtain list of hashes, all I have to do is lookup the hash on the table to obtain your password.
With a salt, you cannot generate a rainbow table for a specific algorithm prior to an attack. A salt is not meant to be secret, you store it alongside the hash in your database.
x = hash(salt+password)
You will then store it in your database in the format of salt+x
This renders rainbow tables and hash tables useless.
As usual don't roll your own, use bcrypt
, scrypt
or pbkdf2
which takes care of all the details including salting for you. See How to securely hash passwords?
A rainbow table is an optimization for reversing hashes by brute force. It works by a trade-off: you do a lot of precomputation to build a huge data structure, and you can then crack many hashes quickly.
A rainbow table only helps the crack hashes in the search space that it covers. Concretely, rainbow tables are built for plaintexts made of printable characters and [up to a certain length. The basic principle of adding a salt is that the plaintext that is hashed contains both the password and the salt; with the added salt, the search space becomes too large to build a rainbow table.
Thus, by adding a salt to the password, there is no way to amortize the cost of a brute force attack over many cracks. The attacker has to do all the work for each hash, starting only when he knows the salt.
Given a hash and a salt, there is no way to “remove the salt”. This is a basic property of a hash function: even if you know that two strings are related (e.g. you know that password is a substring of password+salt), it doesn't help you find hash(password) knowing hash(password+salt). There's no need to hide the salt from the attacker: it needs to be unique (and not derived from the password) but it doesn't need to be more secret than the hash. (You can add an additional secret salt, which is called a pepper, but it's only useful in a narrow set of circumstances.)
Salting the hash is only half the battle. Another part of securely hashing passwords is that the hash function must be slow, because that hurts the cracker a lot more than the verifier. Don't roll your own, use a method that has been vetted by experts. Use bcrypt or PBKDF2 or scrypt. Implementing password storage should not involve doing any cryptography yourself, call a library function.
For everything you wanted to know about hashing passwords, everything you didn't want to know about hashing passwords, and everything you didn't even know you might know about hashing passwords, read How to securely hash passwords?
You are fundamentally correct that it is just making the password longer but there are some additional things that it does add. Also the average password is not that long or secure and adding length "for free" is rarely bad. The salt makes it so that an attacker can't hash "password" once and then look for every user that has a password of "password".
Instead, they would have to generate hashes of "password03j9834nfp-3028n408" and "passwordn0438wnvas89v5sne" and... This greatly increases the protection of identifying insecure passwords and still has a positive benefit on increasing the difficulty of finding secure passwords.
Where your understanding goes astray is when you say "Couldn't an attacker just build a rainbow table, then convert all the hashes and remove the salt?". There are a few problems here.
The first is that of collisions. A rainbow table is not guaranteed to produce the original input, it is only interested in producing AN input that produced the desired output. Since the salt will be added to whatever a user enters, the attacker must find the exact input that was used for the hash.
Since the exact input has to be found, the benefit of a rainbow table to leverage collisions is lost. Further, the size required to brute force all possible salts would be too large for pretty much any attacker to hold.
You also can't simply remove the salt from a hash without figuring out what the original input was. You would have to look for some value ending in the salt that corresponds to the given hash. This then requires a separate rainbow table to be produced for every salt value in the DB and this defeats the ability to pre-compute since it is infeasible to make a rainbow table for every possible salt.