How does a Java HashMap handle different objects with the same hash code?

A hashmap works like this (this is a little bit simplified, but it illustrates the basic mechanism):

It has a number of "buckets" which it uses to store key-value pairs in. Each bucket has a unique number - that's what identifies the bucket. When you put a key-value pair into the map, the hashmap will look at the hash code of the key, and store the pair in the bucket of which the identifier is the hash code of the key. For example: The hash code of the key is 235 -> the pair is stored in bucket number 235. (Note that one bucket can store more then one key-value pair).

When you lookup a value in the hashmap, by giving it a key, it will first look at the hash code of the key that you gave. The hashmap will then look into the corresponding bucket, and then it will compare the key that you gave with the keys of all pairs in the bucket, by comparing them with equals().

Now you can see how this is very efficient for looking up key-value pairs in a map: by the hash code of the key the hashmap immediately knows in which bucket to look, so that it only has to test against what's in that bucket.

Looking at the above mechanism, you can also see what requirements are necessary on the hashCode() and equals() methods of keys:

If two keys are the same (equals() returns true when you compare them), their hashCode() method must return the same number. If keys violate this, then keys that are equal might be stored in different buckets, and the hashmap would not be able to find key-value pairs (because it's going to look in the same bucket).
If two keys are different, then it doesn't matter if their hash codes are the same or not. They will be stored in the same bucket if their hash codes are the same, and in this case, the hashmap will use equals() to tell them apart.

HashMap structure diagram

HashMap is an array of Entry objects.

Consider HashMap as just an array of objects.

Have a look at what this Object is:

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        final int hash;
… 
}

Each Entry object represents a key-value pair. The field next refers to another Entry object if a bucket has more than one Entry.

Sometimes it might happen that hash codes for 2 different objects are the same. In this case, two objects will be saved in one bucket and will be presented as a linked list. The entry point is the more recently added object. This object refers to another object with the next field and so on. The last entry refers to null.

When you create a HashMap with the default constructor

HashMap hashMap = new HashMap();

The array is created with size 16 and default 0.75 load balance.

Adding a new key-value pair

Calculate hashcode for the key
Calculate position hash % (arrayLength-1) where element should be placed (bucket number)
If you try to add a value with a key which has already been saved in HashMap, then value gets overwritten.
Otherwise element is added to the bucket.

If the bucket already has at least one element, a new one gets added and placed in the first position of the bucket. Its next field refers to the old element.

Deletion

Calculate hashcode for the given key
Calculate bucket number hash % (arrayLength-1)
Get a reference to the first Entry object in the bucket and by means of equals method iterate over all entries in the given bucket. Eventually we will find the correct Entry. If a desired element is not found, return null

Your third assertion is incorrect.

It's perfectly legal for two unequal objects to have the same hash code. It's used by HashMap as a "first pass filter" so that the map can quickly find possible entries with the specified key. The keys with the same hash code are then tested for equality with the specified key.

You wouldn't want a requirement that two unequal objects couldn't have the same hash code, as otherwise that would limit you to 2³² possible objects. (It would also mean that different types couldn't even use an object's fields to generate hash codes, as other classes could generate the same hash.)

How does a Java HashMap handle different objects with the same hash code?

Adding a new key-value pair

Deletion

Tags:

Java

Hashcode

Hashmap

Hash Function

Related

Recent Posts