Should the hash code of null always be zero in .NET?
Bear in mind that the hash code is only a first step in determining equality; it is never (and should never be) used as the final word on whether two objects are equal.
If two objects' hash codes are not equal, they are treated as not equal (because we assume that the underlying implementation is correct, i.e. we don't second-guess it). If they have the same hash code, they should then be checked for actual equality, a check which, in your case, the null and the enum value will fail.
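To make the two-step check concrete, here is a minimal sketch. The Season enum and the NullHashDemo class are hypothetical names invented for illustration, and the collision assumes the runtime hashes an int-backed enum member to its underlying value, as current .NET runtimes appear to do.

```csharp
using System;

// Hypothetical enum whose first member has the underlying value 0.
public enum Season { Spring = 0, Summer = 1 }

public static class NullHashDemo
{
    // Step 1: compare hash codes. Nullable<T>.GetHashCode() returns 0 when
    // there is no value; Season.Spring is assumed to hash to 0 as well.
    public static bool HashesCollide()
    {
        Season? none = null;
        Season? spring = Season.Spring;
        return none.GetHashCode() == spring.GetHashCode();
    }

    // Step 2: the real equality check, which settles the matter.
    public static bool AreEqual()
    {
        Season? none = null;
        Season? spring = Season.Spring;
        return none.Equals(spring);
    }
}
```

Under those assumptions HashesCollide() should report a collision while AreEqual() still reports that the two values differ, which is the extra comparison being discussed.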
As a result - using zero is as good as any other value in the general case.
Sure, there will be situations, like your enum, where this zero is shared with a real value's hash code. The question is whether, for you, the minuscule overhead of an additional comparison causes problems.
If so, then define your own comparer for the nullable of your particular type, and ensure that a null value always yields the same hash code (of course!), and one that cannot be produced by the underlying type's own hash code algorithm. For your own types, this is doable. For others, good luck :)
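A comparer along those lines might look like the sketch below. Color, NullableColorComparer and the sentinel value -1 are all assumptions made up for this example; the sentinel only works because every defined Color member is non-negative.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum; all defined members have non-negative values.
public enum Color { Red = 0, Green = 1, Blue = 2 }

public sealed class NullableColorComparer : IEqualityComparer<Color?>
{
    // Assumed sentinel: no defined Color member maps to -1 below.
    private const int NullHash = -1;

    // Lifted equality: null equals null, and null never equals a value.
    public bool Equals(Color? x, Color? y) => x == y;

    // Non-null values hash to their underlying integer; null gets the sentinel.
    public int GetHashCode(Color? value) =>
        value.HasValue ? (int)value.Value : NullHash;
}
```

Whether a given collection actually consults the comparer for a null item is up to that collection, so it is worth checking the one you use before relying on this.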
It doesn't have to be zero -- you could make it 42 if you wanted to.
All that matters is consistency during the execution of the program.
It's just the most obvious representation, because null is often represented as a zero internally. Which means, while debugging, if you see a hash code of zero, it might prompt you to think, "Hmm.. was this a null reference issue?"
Note that if you use a number like 0xDEADBEEF, then someone could say you're using a magic number... and you kind of would be. (You could say zero is a magic number too, and you'd be kind of right... except that it's so widely used as to be somewhat of an exception to the rule.)
So long as the hash code returned for nulls is consistent for the type, you should be fine. The only requirement for a hash code is that two objects that are considered equal share the same hash code.
Returning 0 or -1 for null, so long as you choose one and return it all the time, will work. Ideally, non-null values should not produce whatever hash code you use for null.
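For a type with a field that may be null, the usual pattern is roughly the following. Person and its Name property are hypothetical, and 0 is simply the constant picked for the null case here.

```csharp
using System;

// Hypothetical type whose identifying field may be null.
public sealed class Person
{
    public string Name { get; }

    public Person(string name) { Name = name; }

    // Two people are equal when their names match; two null names match.
    public override bool Equals(object obj) =>
        obj is Person other && string.Equals(Name, other.Name);

    // Equal objects must share a hash code, so every null Name
    // returns the same constant.
    public override int GetHashCode() =>
        Name == null ? 0 : Name.GetHashCode();
}
```

Any other constant would do just as well, provided it is the only one this type ever returns for a null Name.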
Similar questions:
GetHashCode on null fields?
What should GetHashCode return when object's identifier is null?
The "Remarks" section of this MSDN entry goes into more detail about the hash code. Pointedly, the documentation does not provide any coverage or discussion of null values at all, not even in the community content.
To address your issue with the enum, either re-implement the hash code to return non-zero, add a default "unknown" enum entry equivalent to null, or simply don't use nullable enums.
Interesting find, by the way.
Another problem I see with this generally is that the hash code cannot represent a nullable type of 4 bytes or more without at least one collision (more as the type size increases). For example, the hash code of an int is just the int itself, so it already uses the full int range. What value in that range do you choose for null? Whichever one you pick will collide with the hash code of the int that has that same value.
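The int case can be checked with a few lines. IntHashDemo and HashOf are hypothetical helper names; the sketch relies on the behaviour described above, namely that an int hashes to itself and that a Nullable<T> without a value hashes to 0.

```csharp
using System;

public static class IntHashDemo
{
    // Nullable<int>.GetHashCode() forwards to the wrapped int when a value
    // is present and returns 0 otherwise.
    public static int HashOf(int? value) => value.GetHashCode();
}
```

HashOf(null) and HashOf(0) should therefore come out the same, and since every other int already occupies its own slot, no alternative choice for null would avoid a collision.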
Collisions in and of themselves are not necessarily a problem, but you need to know they are there. Hash codes are only used in some circumstances. As stated in the docs on MSDN, hash codes are not guaranteed to be different for different objects, so you shouldn't expect them to be.