How does HashSet not allow duplicates?
See HashMap#put
:
Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.
It replaces the key with the new value, this way, no duplicates will be in the HashSet
.
PRESENT
is just a dummy value -- the set doesn't really care what it is. What the set does care about is the map's keys. So the logic goes like this:
Set.add(a):
map.put(a, PRESENT) // so far, this is just what you said
the key "a" is in the map, so...
keep the "a" key, but map its value to the PRESENT we just passed in
also, return the old value (which we'll call OLD)
look at the return value: it's OLD, != null. So return false.
Now, the fact that OLD == PRESENT
doesn't matter -- and note that Map.put
doesn't change the key, just the value mapped to that key. Since the map's keys are what the Set
really cares about, the Set
is unchanged.
In fact, there has been some change to the underlying structures of the Set
-- it replaced a mapping of (a, OLD)
with (a, PRESENT)
. But that's not observable from outside the Set
's implementation. (And as it happens, that change isn't even a real change, since OLD == PRESENT
).
As you can see the HashSet.add
method adds the element to the HashMap.put
as a key not as a value. Value is replaced in the HashMap
not the key.
The answer that you may be looking comes down to the fact that the backing hashmap maps the elements of the set to the value PRESENT
which is defined in HashSet.java as follows:
private static final Object PRESENT = new Object();
In the source code for HashMap.put
we have:
386 public V put(K key, V value) {
387 if (key == null)
388 return putForNullKey(value);
389 int hash = hash(key.hashCode());
390 int i = indexFor(hash, table.length);
391 for (Entry<K,V> e = table[i]; e != null; e = e.next) {
392 Object k;
393 if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
394 V oldValue = e.value;
395 e.value = value;
396 e.recordAccess(this);
397 return oldValue;
398 }
399 }
400
401 modCount++;
402 addEntry(hash, key, value, i);
403 return null;
404 }
Because the key in question already exists, we will take the early return on line 397. But you might think a change is being made to the map on line 395, in which it appears that we are changing the value of a map entry. However, the value of value
is PRESENT
. But because PRESENT
is static and final, so there is only one such instance; and so the assignment e.value = value
actually doesn't change the map, and therefore the set, at all!
Update:
Once a
HashSet
is initialized.
- All the items in it are stored as keys in aHashMap
- All the values thatHashMap
have ONLY ONE object that isPRESENT
which is a static field inHashSet