What is the specific reason to prefer bcrypt or PBKDF2 over SHA256-crypt in password hashes?
The main reason to use a specific password hashing function is to make life harder for attackers, or, more accurately, to prevent them from making their own life easier (when compared to that of the defender). In particular, the attacker may want to compute more hashes per second (i.e. try more passwords per second) with a given budget by using a GPU.
SHA-256, in particular, benefits a lot from being implemented on a GPU. Thus, if you use SHA-256-crypt, attackers gain a larger advantage than if you use bcrypt, which is hard to implement efficiently on a GPU.
See this answer for some discussion of bcrypt vs PBKDF2. Though SHA-256-crypt is not PBKDF2, it is similar enough in its performance behaviour on GPU, so the same conclusions apply.
The case for SHA-512 is a bit less clear because existing GPUs are much better at 32-bit integer operations than 64-bit ones, and SHA-512 uses mostly 64-bit operations. It is still expected that modern GPUs allow more hashes per second than CPUs (for a given budget) with SHA-512-crypt, which again points to bcrypt as the better choice.
The SHA-2 family of hashes was designed to be fast. BCrypt was designed to be slow. Both are considered robust. With enough rounds or a high enough work factor, either one can take longer than the other, but I would lean towards the one that was designed to be slow. (If server load is an issue, the work factor is adjustable.)
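As a rough illustration of tuning the work factor, here is a minimal Python sketch that doubles an iteration count until a single hash takes at least a target time. It uses `hashlib.pbkdf2_hmac` from the standard library purely as a stand-in for an adjustable-cost hash; the function name `calibrate_iterations`, the starting count and the benchmark inputs are my own assumptions, and bcrypt's cost parameter would be tuned in the same spirit.

```python
import hashlib
import time


def calibrate_iterations(target_seconds: float, start: int = 10_000) -> int:
    """Double the iteration count until one PBKDF2 call takes >= target_seconds.

    Hypothetical helper for illustration: run it on the production hardware,
    since the result depends entirely on the machine doing the hashing.
    """
    iterations = start
    while True:
        began = time.perf_counter()
        # Fixed dummy inputs; only the elapsed time matters here.
        hashlib.pbkdf2_hmac("sha256", b"benchmark-password", b"benchmark-salt", iterations)
        elapsed = time.perf_counter() - began
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2


if __name__ == "__main__":
    # Example: aim for roughly 100 ms per hash on this machine.
    print(calibrate_iterations(0.1))
```

The doubling search is coarse on purpose: it overshoots the target by at most a factor of two, which is usually good enough for picking a work factor.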
Additionally, I would lean towards BCrypt because its implementations are usually compiled (C or C++).
Multi-round SHA can easily be implemented in a high-level language, at least for the iteration, if not also for the hash itself. High-level languages are less efficient at basic mathematical operations, which reduces the number of rounds your production hardware can complete per millisecond.
Both algorithms can be implemented in high-level languages, low-level languages, or a hybrid of the two. With BCrypt, however, the available options make it more likely that you land on an efficient implementation, which puts you on a more even playing field with the attacker.
Regarding your specific example from the /etc/shadow file, you are likely using only low-level (efficient) implementations either way (SHA or BCrypt). In this case I would suggest consulting the OS documentation to tune the rounds (work factor) based on the speed of the hardware versus how strong you would like the hash to be.
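To see which round count an existing entry uses, the hash field can be split on `$`. The sketch below is a minimal parser for the `$5$` and `$6$` formats only; it assumes the layout `$id$[rounds=N$]salt$digest` and the default of 5000 rounds from the SHA-crypt specification when no `rounds=` field is present. The names `parse_sha_crypt` and `ShaCryptFields` are hypothetical, and the sample strings are placeholders, not real hashes.

```python
from typing import NamedTuple

# Default used by the SHA-crypt specification when "rounds=" is absent.
DEFAULT_ROUNDS = 5000

SCHEMES = {"5": "sha256-crypt", "6": "sha512-crypt"}


class ShaCryptFields(NamedTuple):
    scheme: str
    rounds: int
    salt: str
    digest: str


def parse_sha_crypt(entry: str) -> ShaCryptFields:
    """Split a $5$ / $6$ crypt string into its fields (illustrative only)."""
    parts = entry.split("$")
    # A leading "$" yields an empty first element.
    if len(parts) < 4 or parts[0] != "" or parts[1] not in SCHEMES:
        raise ValueError("not a $5$ or $6$ crypt string")
    if parts[2].startswith("rounds="):
        if len(parts) != 5:
            raise ValueError("malformed crypt string")
        rounds = int(parts[2][len("rounds="):])
        salt, digest = parts[3], parts[4]
    else:
        if len(parts) != 4:
            raise ValueError("malformed crypt string")
        rounds = DEFAULT_ROUNDS
        salt, digest = parts[2], parts[3]
    return ShaCryptFields(SCHEMES[parts[1]], rounds, salt, digest)


if __name__ == "__main__":
    # Placeholder values, not real password hashes.
    print(parse_sha_crypt("$6$rounds=100000$examplesalt$placeholderdigest"))
    print(parse_sha_crypt("$5$examplesalt$placeholderdigest"))
```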
scrypt (with a great enough work factor) has the added benefit of having extra RAM/Memory requirements (not just CPU), making it more GPU-resistant than SHA, BCrypt or PBKDF2.
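For a feel of the memory cost, here is a small sketch using `hashlib.scrypt` from the Python standard library (available when Python is built against a recent enough OpenSSL). The dominant memory term is roughly 128 * N * r bytes; the helper names `scrypt_memory_bytes` and `derive_scrypt`, and the default parameters chosen here, are my own assumptions rather than a recommendation.

```python
import hashlib


def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate working memory of scrypt: the dominant term 128 * N * r."""
    return 128 * n * r


def derive_scrypt(password: bytes, salt: bytes,
                  n: int = 2**14, r: int = 8, p: int = 1) -> bytes:
    """Derive a 32-byte key; maxmem is set with headroom above the estimate."""
    maxmem = 2 * scrypt_memory_bytes(n, r) + 1024 * 1024
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p,
                          maxmem=maxmem, dklen=32)


if __name__ == "__main__":
    # N = 2**14, r = 8 needs about 16 MiB per hash computation.
    print(scrypt_memory_bytes(2**14, 8) // (1024 * 1024), "MiB")
    print(derive_scrypt(b"example password", b"example-salt-16b").hex())
```

Each candidate password an attacker tries needs that much memory, which is what limits how many guesses can run in parallel on a GPU.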
Edit: Thanks to Thomas for pointing out that BCrypt is more GPU-resistant than SHA-2, and that SHA-2 and PBKDF2 are practically equivalent in this regard.
Note: I'm looking at this question after this edit was made, and taking it into account:
Note: I specifically mean the multi-round password hashes described by the linked documents and marked with the codes $5$ and $6$ in crypt hashes, not a single round of the plain SHA256 or SHA512 hash functions.
Looking at the long, 22-step algorithm in the link you provided, I'd rather flip the question around: why would you prefer to use this instead of PBKDF2 with HMAC-SHA2? Because, at least as presented:
- The definition of PBKDF2 looks much simpler. This is because it is more modular—it defers most of its work to an externally-supplied pseudo-random function. This is normally instantiated with HMAC, which in turn defers most of its work to an external hash function like SHA-1 or SHA-2.
- This means that the security of PBKDF2 should be easier to analyze.
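To make the modularity point concrete, here is a minimal sketch of PBKDF2 written directly on top of HMAC, checked against the standard library's `hashlib.pbkdf2_hmac`. The function name `pbkdf2_sketch` is hypothetical and the code is for reading, not for production use; in practice you would call the library function.

```python
import hashlib
import hmac


def pbkdf2_sketch(password: bytes, salt: bytes, iterations: int,
                  dklen: int, hash_name: str = "sha256") -> bytes:
    """PBKDF2 with HMAC as the pseudo-random function (illustrative)."""

    def prf(data: bytes) -> bytes:
        # All the cryptographic work is delegated to HMAC over the chosen hash.
        return hmac.new(password, data, hash_name).digest()

    hlen = hashlib.new(hash_name).digest_size
    blocks = []
    block_index = 1
    while len(blocks) * hlen < dklen:
        # U_1 = PRF(salt || INT(block_index)), U_k = PRF(U_{k-1})
        u = prf(salt + block_index.to_bytes(4, "big"))
        t = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            u = prf(u)
            t ^= int.from_bytes(u, "big")  # block = U_1 xor U_2 xor ... xor U_c
        blocks.append(t.to_bytes(hlen, "big"))
        block_index += 1
    return b"".join(blocks)[:dklen]


if __name__ == "__main__":
    mine = pbkdf2_sketch(b"password", b"salt", 1000, 32)
    reference = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000, 32)
    print(mine == reference)
```

The whole construction is one loop around a PRF call, which is what makes it comparatively easy to reason about.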
In contrast, the algorithm in the document you provide lists a ton of steps whose motivation is harder to understand. For example:
11. For each bit of the binary representation of the length of the
password string up to and including the highest 1-digit, starting
from to lowest bit position (numeric value 1):
a) for a 1-digit add digest B to digest A
b) for a 0-digit add the password string
NB: this step differs significantly from the MD5 algorithm. It
adds more randomness.
It adds more randomness? How does it do this? Why does this step exist at all? Is SHA-2 not adding sufficient randomness? If SHA-2 isn't random enough, why use it in the first place? And doesn't this step introduce secret-dependent branching into the algorithm, raising the question of possible timing attacks against it?
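To show what the quoted step does mechanically, here is a small sketch that lists, in order, the chunks step 11 would feed into digest A. It is only my reading of the quoted text, not code from the specification; `step11_chunks` and the dummy `digest_b` value are hypothetical. Note that in this reading the branch is driven by the password's length rather than by its characters.

```python
from typing import List


def step11_chunks(password: bytes, digest_b: bytes) -> List[bytes]:
    """Return the data step 11 adds to digest A, lowest length bit first."""
    chunks = []
    remaining = len(password)
    while remaining > 0:
        if remaining & 1:
            chunks.append(digest_b)   # 1-bit: add digest B
        else:
            chunks.append(password)   # 0-bit: add the password string
        remaining >>= 1
    return chunks


if __name__ == "__main__":
    dummy_b = b"B" * 32  # stand-in for the real digest B
    # Length 5 is binary 101, so the order is: digest B, password, digest B.
    for chunk in step11_chunks(b"hello", dummy_b):
        print("digest B" if chunk == dummy_b else "password")
```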
I'm not by any means saying that the algorithms you link are insecure. It's just that:
- The work factor they introduce comes down to the same thing that PBKDF2-HMAC-SHA2 would do (a large number of SHA2 iterations);
- They look very broadly similar to what you'd have if you unrolled a PBKDF2-HMAC-SHA2 implementation, but with additional complexity whose purpose I don't understand;
- So at least as presented in those documents, I find it harder to gain confidence in their design than I do for PBKDF2.
EDIT: After I wrote all this I went and did a bit of research into this algorithm to try and understand it better. First, from the question's own "description" and "specification" links, we learn that the algorithm was derived from an older MD5-based one by making relatively minor modifications.
This older MD5-based algorithm appears to be the one that Poul-Henning Kamp wrote for FreeBSD-2.0 in 1994, which he no longer considers safe. In the first link (where he tells the history of the function), he mentions that glibc adopted his function as well. He also links to Provos and Mazières' 1999 paper on bcrypt and mentions that it expressed some disapproval, and funnily enough they highlighted the same step that caught my attention above:
MD5 crypt hashes the password and salt in a number of different combinations to slow down the evaluation speed. Some steps in the algorithm make it doubtful that the scheme was designed from a cryptographic point of view—for instance, the binary representation of the password length at some point determines which data is hashed, for every zero bit the first byte of the password and for every set bit the first byte of a previous hash computation.
But I think this explains the motivation of the newer functions that you ask about: they are a very minimal modification of an older function that predates most of the modern password hash functions, whose design has been called into question but is likely not fundamentally broken, just pointlessly complex.