Password entropy varies between different checks
Attempting to add to the other Connor's answer:
Something important to keep in mind is that entropy of course is, in essence, "the amount of randomness" in the password. Therefore, part of why different entropy checkers will disagree is because the entropy is a measure of how the password was generated, not what the password contains. An extreme example is usually the best way to show what I mean. Imagine that my password was frphevgl.fgnpxrkpunatr.pbzPbabeZnapbar
.
An entropy checker will probably rate that with a high amount of entropy because it contains no words and is long. It doesn't contain numbers, but taking a simple calculation (like what Connor outlined in his answer, and what most entropy calculators do), you might guess an entropy of 216 bits of entropy - far more than a typical password needs these days (38 characters with a mix of upper and lower case gives 52^38 ≈ 2^216).
However, looking at that, someone might suspect that my password isn't really random at all and might realize that it is just the rot13 transformation of site name + my name
. Therefore the reality is that there is no entropy in my password at all, and anyone who knows how I generate my passwords will know what my password is to every site I log in as.
This is an extreme example but I hope it gets the point across. Entropy is determined not by what the password looks like but by how it is generated. If you use some rules to generate a password for each site then your passwords might not have any entropy at all, which means that anyone who knows your rules knows your passwords. If you use lots of randomness then you have a high entropy password and it is secure even if someone knows how you make your passwords.
Entropy calculators make assumptions about how the passwords were generated, and therefore they can both disagree with eachother and also be wildly wrong. Also, XKCD is always applicable when it comes to these sorts of things.
Password entropy is calculated by the number of possibilities it could be to the power of the length ie. 8 character password of both upper and lowercase letters = (26*2)8, (26 characters of the alphabet * 2 for upper and lowercase).
If you include the numbers 0-9 as well, then that power becomes 26*2+10, and if you include special characters as well this number can become quite large for number of possibilities.
So as an example, we'll take a password QweRTy123456, (a horrible password I know). This password is 12 characters long, so the power is the number 12, it uses both uppercase, lowercase and numbers, so we have 6212. Which gives a total number of possibilities of 3.2262668e+21.
Now if we take that same password, but all in lowercase, ie. qwerty123456, the value we have is 3612, giving us a potential of 4.7383813e+18, still an enormous number but much smaller than using uppercase as well.
Password strength relies on two major factors, length and complexity. As an example I'll show a numeric PIN of 4 vs 8 characters to show the difference. So for a 4 digit pin number, 104 we have 10,000 possible combinations, and if we use an 8 digit pin, ie. 108 we have 100,000,000 possibilities. So by doubling the length of the password, we have increased the potential candidates 10,000-fold.
The second factor of passwords relies on complexity, ie. upper and lowercase, special characters, etc. I gave an example above to show how this increases quickly by using more character sets.
Just a final note, a password is not as strong as its potential, because a 12 character password could be someones place of birth, a pets name, etc. The contents of a password are also extremely important. The 4 random words model tends to be quite popular and secure, mandatory xkcd.
ConorMancone's answer gives an excellent explanation and example of the contents of a password vs the entropy, so i'd suggest giving that a read too for more info on this subject.
So in summary, take the number of possibilities from the character set, to the power of the length of the password, and divide by 2 for a reliable method of getting password strength based on brute-forcing techniques.
Hopefully this answers your question, if you have any more leave a comment and I'll update this to reflect your questions.
Password entropy is calculated by: Knowing (or guessing) the algorithm used to generate the password, and gathering in the number of different branch points used in generating the password you chose.
Let me give some examples:
Password: password
This isn't 26^8 (or 2^38) - because the algorithm wasn't "choose 8 random lowercase characters". The algorithm was: choose a single, very easy to remember word. How many such words are there? If you decide, "there are 200 such words", then you're looking at about 8 bits of entropy (not 38.)
Password: password6
Similar to the previous entry, this isn't 36^9 (or 2^47) - because the algorithm is choose a single, very easy to remember word, and then decorate it at the end with a single digit number. The entropy here is around 11 bits (not 47.)
Password: carpet#terraform2
By now you can guess what's going on. Two relatively uncommon words, with a punctuation character between them and a number digit at the end. If you estimate that those words were chosen from a dictionary of 10000 words (2^13), you're looking at something like 33 bits of entrophy (13 for the first word + 4 for the punctuation + 13 for the second word + 3 for the final digit.)
So, now, to answer your direct question: why do the various entropy checkers give different values?
Well, let's use that last password: carpet#terraform2.
One entropy-evaluator might say, "Hey, I have no clue how you generated this. So it must just be random characters among lowercase, punctuation, and numbers. Call it 52^17, or 97 bits of entropy (2^97.)"
Another, slightly smarter entropy-evaluator might say, "Hey, I recognize that first word, but that second string of letters is just random. So the algorithm is a single uncommon word, a punctuation, nine random letters, and then a number. So 10000 x 16 x 26^9 x 10, or 63 bits of entropy"
A third and fourth entropy-evaluator might correctly figure out the algorithm used to generate it. But the third evaluator thinks both words should come from a dictionary of 5000 words, but the fourth evaluator thinks you have to break into a 30,000 word dictionary to find them. So one comes up with 32 bits on entropy while the other thinks there are 37 bits.
Hopefully it's starting to make sense. The reason different entropy evaluators are coming up with different numbers is because they're all coming up with different evaluations on how the password was generated.