Why does ConcurrentHashMap prevent null keys and values?
I believe it is, at least in part, to allow you to combine containsKey and get into a single call. If the map can hold nulls, there is no way to tell whether get is returning null because there was no mapping for that key, or just because the mapped value was null.
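As a rough illustration of that ambiguity, here is a minimal sketch. The class and helper names are made up for this example; only HashMap, ConcurrentHashMap and their standard methods are real.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of any library.
class NullAmbiguityDemo {
    // True when get() returns null for both keys even though only one of them is mapped.
    static boolean getIsAmbiguous(Map<String, String> m, String mapped, String unmapped) {
        return m.get(mapped) == null && m.get(unmapped) == null
                && m.containsKey(mapped) && !m.containsKey(unmapped);
    }

    // True when the map refuses to store a null value.
    static boolean rejectsNullValue(Map<String, String> m) {
        try {
            m.put("k", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        plain.put("present", null); // HashMap accepts a null value
        System.out.println(getIsAmbiguous(plain, "present", "absent"));   // true
        System.out.println(rejectsNullValue(new ConcurrentHashMap<>())); // true
    }
}
```

With a plain HashMap, only the extra containsKey call separates the two cases; ConcurrentHashMap avoids the question by throwing NullPointerException on the put.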
Why is that a problem? Because there is no safe way to do that yourself. Take the following code:
    if (m.containsKey(k)) {
        return m.get(k);
    } else {
        throw new KeyNotPresentException();
    }
Since m is a concurrent map, key k may be deleted between the containsKey and get calls, causing this snippet to return a null that was never in the table, rather than the desired KeyNotPresentException.
Normally you would solve that by synchronizing, but with a concurrent map that of course won't work. Hence the contract of get had to change, and the only way to do that in a backwards-compatible way was to prevent the user from inserting null values in the first place, and to continue using null as a placeholder for "key not found".
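Because nulls are rejected, the check-then-act pair can be collapsed into a single get. The following is only a sketch: getOrThrow is a hypothetical helper, and NoSuchElementException stands in for the KeyNotPresentException used in the snippet, which is not a standard class.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class, not part of any library.
class SingleCallLookup {
    // One get() replaces the containsKey/get pair, so there is no gap
    // in which another thread could remove the key between two calls.
    static <K, V> V getOrThrow(ConcurrentMap<K, V> m, K k) {
        V v = m.get(k);
        if (v == null) {
            // For a map that rejects null values, null can only mean "no mapping".
            throw new NoSuchElementException("key not present: " + k);
        }
        return v;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Integer> m = new ConcurrentHashMap<>();
        m.put("a", 1);
        System.out.println(getOrThrow(m, "a")); // 1
        try {
            getOrThrow(m, "missing");
        } catch (NoSuchElementException e) {
            System.out.println("missing key reported");
        }
    }
}
```

The helper is only unambiguous for ConcurrentMap implementations that forbid null values, which is the case for ConcurrentHashMap and ConcurrentSkipListMap.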
Josh Bloch designed HashMap; Doug Lea designed ConcurrentHashMap. I hope that isn't libelous. Actually I think the problem is that nulls often require wrapping so that the real null can stand for uninitialized. If client code requires nulls then it can pay the (admittedly small) cost of wrapping nulls itself.
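One way client code could do that wrapping is with java.util.Optional as the stored value. This is a sketch of one possible approach, not something the JDK provides; NullableValueMap and its methods are invented for the example.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper: stores Optional.empty() where the caller wanted a null value.
class NullableValueMap<K, V> {
    private final ConcurrentHashMap<K, Optional<V>> map = new ConcurrentHashMap<>();

    void put(K key, V valueOrNull) {
        map.put(key, Optional.ofNullable(valueOrNull));
    }

    // Returns null when the key is unmapped, Optional.empty() when it is
    // mapped to "null", and a non-empty Optional otherwise.
    Optional<V> lookup(K key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        NullableValueMap<String, String> m = new NullableValueMap<>();
        m.put("explicit-null", null);
        m.put("real", "value");
        System.out.println(m.lookup("explicit-null")); // Optional.empty
        System.out.println(m.lookup("real"));          // Optional[value]
        System.out.println(m.lookup("unmapped"));      // null
    }
}
```

A single lookup call now distinguishes all three cases, at the cost of one small wrapper object per entry.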
From the author of ConcurrentHashMap himself (Doug Lea):
The main reason that nulls aren't allowed in ConcurrentMaps (ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities that may be just barely tolerable in non-concurrent maps can't be accommodated. The main one is that if map.get(key) returns null, you can't detect whether the key explicitly maps to null vs the key isn't mapped. In a non-concurrent map, you can check this via map.contains(key), but in a concurrent one, the map might have changed between calls.
You can't synchronize on a null.
Edit: This isn't exactly why in this case. I initially thought there was something fancy going on with locking things against concurrent updates or otherwise using the Object monitor to detect if something was modified, but upon examining the source code it appears I was wrong - they lock using a "segment" based on a bitmask of the hash.
In that case, I suspect they did it to copy Hashtable, and I suspect Hashtable did it because in the relational database world, null != null, so using a null as a key has no meaning.