Why is MD5 considered a vulnerable algorithm?
I know that MD5 is the most vulnerable hashing algorithm
Well, technically (we are technical around here), there are worse algorithms than MD5.
and particularly vulnerable to Collisions
Yes, an attacker can construct two different inputs that produce the same MD5 hash. This is not likely to happen randomly, but it can be done deliberately and maliciously.
But the collision vulnerability is not very risky and somebody might use that as an advantage, but that's with sheer luck.
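Not sheer luck. There are practical techniques for constructing pairs of inputs that share the same MD5. (Finding an input that matches a given, already-fixed MD5 is a preimage attack, which is a different and much harder problem.) That's a good subject for a different question.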
OK, let's say I store passwords using MD5.
Ouch. The main reason you shouldn't use MD5 for passwords is that it is a General Purpose (Fast) Hash: an attacker who steals the hashes can test password guesses at a very high rate.
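To get a rough feel for what "Fast" means here, the sketch below times a single MD5 call against a deliberately slow stand-in (PBKDF2-HMAC-SHA256 from Python's standard `hashlib`). The function names and the iteration count of 100,000 are assumptions chosen for illustration, not recommendations, and absolute timings will vary by machine.

```python
import hashlib
import time


def time_md5(password: bytes) -> float:
    """Return the seconds taken by one MD5 computation."""
    start = time.perf_counter()
    hashlib.md5(password).digest()
    return time.perf_counter() - start


def time_pbkdf2(password: bytes, iterations: int = 100_000) -> float:
    """Return the seconds taken by one PBKDF2-HMAC-SHA256 computation."""
    salt = b"\x00" * 16  # fixed salt: this is a timing demo only, not storage code
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    return time.perf_counter() - start


if __name__ == "__main__":
    guess = b"correct horse battery staple"
    print(f"MD5:    {time_md5(guess):.6f} s per guess")
    print(f"PBKDF2: {time_pbkdf2(guess):.6f} s per guess")
```

The slower each guess is for the defender's chosen function, the fewer guesses an attacker can afford per second on the same hardware.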
You should be using a (Slow) Password Hash such as:
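- BCrypt is commonly recommended, but it only uses the first 72 bytes of its input, so run a quick SHA-2 hash on the password first so that super-long passwords are not truncated. Encode the digest (e.g. as Base64) before passing it on, because a raw digest can contain NUL bytes, which some BCrypt implementations treat as the end of the input.
- PBKDF2, but that is less GPU-resistant because it has lower Memory requirements.
- SCrypt is better than BCrypt if you have a high enough work factor; otherwise it is worse against GPUs (again, because of higher or lower Memory requirements).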
- The winner of the Password Hashing Competition may be even better than the aforementioned, but has not yet stood the test of time, so don't use it just yet. It's called Argon2, and has separate Work Factor settings for CPU time and Memory load. (nice!)
- Repeated SHA-2 can be used instead of PBKDF2 (still not GPU-resistant), but it is trickier to implement the repetition efficiently (i.e. so that it is brute-force resistant), because SHA-2 is actually a General Purpose (Fast) Hash.
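The SHA-2 pre-hash mentioned in the BCrypt item could look roughly like this. It is a sketch of the pre-hash step only, using Python's standard library; the function name `prehash_for_bcrypt` is hypothetical, and the Base64 encoding is an added assumption to keep NUL bytes out of the value handed to BCrypt (BCrypt itself only uses the first 72 bytes of its input).

```python
import base64
import hashlib

BCRYPT_MAX_INPUT_BYTES = 72  # BCrypt ignores input beyond this length


def prehash_for_bcrypt(password: str) -> bytes:
    """Compress a password of any length into 44 Base64 bytes."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()  # 32 raw bytes
    # Base64 output never contains NUL bytes and stays well under 72 bytes.
    return base64.b64encode(digest)
```

The returned bytes would then be passed to whichever BCrypt library you use in place of the raw password, both when storing and when verifying.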
Most of these options generate random Salt by default, but you should verify whether this is the case!
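If your library does not generate the Salt for you, generating and storing one per password is straightforward. Here is a minimal sketch using `hashlib.scrypt` from Python's standard library (available when Python is built against a recent OpenSSL). The cost parameters and the function names `hash_password` and `verify_password` are assumptions for illustration and would need tuning for real hardware.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumed cost parameters for illustration; tune them for your own hardware.
SCRYPT_N = 2**14  # CPU/memory cost
SCRYPT_R = 8      # block size
SCRYPT_P = 1      # parallelism


def _scrypt(password: str, salt: bytes) -> bytes:
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = secrets.token_bytes(16)  # fresh random Salt for every password
    return salt, _scrypt(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored Salt and compare in constant time."""
    return hmac.compare_digest(_scrypt(password, salt), expected)
```

Because the Salt is random, two users with the same password end up with different stored digests.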
It is best to prepend some Pepper (~72 bits of entropy) to the Password before hashing. The Pepper can be the same for all your users, but it should be stored in a file outside of the database so that it cannot be retrieved via SQL Injection.
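As a rough sketch of the Pepper idea: generate 9 random bytes (72 bits) once, keep them in a file outside the database, and put them in front of the password before it goes into the slow hash. The function names, the file-based `load_pepper` helper, and the use of PBKDF2 with 100,000 iterations as the slow hash are all assumptions made for this example.

```python
import hashlib
import secrets

PEPPER_BYTES = 9  # 9 bytes = 72 bits of entropy


def generate_pepper() -> bytes:
    """Create a new Pepper; do this once and save it outside the database."""
    return secrets.token_bytes(PEPPER_BYTES)


def load_pepper(path: str) -> bytes:
    """Read the Pepper from a file (the path is whatever your deployment uses)."""
    with open(path, "rb") as f:
        return f.read()


def hash_with_pepper(password: str, salt: bytes, pepper: bytes,
                     iterations: int = 100_000) -> bytes:
    """Place the Pepper before the password, then apply the slow hash."""
    peppered = pepper + password.encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, iterations)
```

An attacker who dumps only the database gets the Salt and the digest but not the Pepper, so guesses cannot be checked offline without it.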
Tune your Work Factor so that one hash takes about 100ms on your target hardware (with appropriate DoS protection), knowing that attackers will use faster hardware for Brute force.
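One way to find such a Work Factor is to measure it on the target machine. The sketch below doubles a PBKDF2 iteration count until a single hash takes at least the target time; the function name, the starting value, and the choice of PBKDF2-HMAC-SHA256 are assumptions for illustration, and the same idea applies to the cost parameters of the other algorithms.

```python
import hashlib
import time


def calibrate_pbkdf2_iterations(target_seconds: float = 0.1,
                                start_iterations: int = 10_000) -> int:
    """Double the iteration count until one hash takes >= target_seconds."""
    iterations = start_iterations
    while True:
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark password",
                            b"benchmark salt!!", iterations)
        elapsed = time.perf_counter() - start
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2  # too fast: make the next attempt twice as expensive


if __name__ == "__main__":
    print("Suggested iterations:", calibrate_pbkdf2_iterations())
```

Run it on hardware comparable to production, and store the chosen value with each hash so it can be raised later.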
Of course, no amount of hashing will protect weak passwords, so include password strength requirements.
collision vulnerability ... is there any way that the attacker can use this as an advantage?
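In the context of password hash storage, this probably will not help the attacker: to recover a password they need an input that matches one specific stored hash, whereas a collision only gives them two inputs of their own choosing that hash to the same value.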