Which one is faster? List.contains() or Map.containsKey()

HashSet seems to be faster:

HashMap: 267
ArrayList: 2183
HashSet: 57

Also note that .contains() normally does not need to be called on HashMap and HashSet, but I kept it on the code to more accurately answer your question:

    long t = System.currentTimeMillis();
    HashMap<String, Boolean> map = new HashMap<>();
    for (int i = 0; i < 10000; i++) {
        String s = (Math.random() * 100) + "";
        if (!map.containsKey(s)) {
            map.put(s, true);
        }
    }
    System.out.println("HashMap: " + (System.currentTimeMillis() - t));

    t = System.currentTimeMillis();
    ArrayList<String> list = new ArrayList<>();
    for (int i = 0; i < 10000; i++) {
        String s = (Math.random() * 100) + "";
        if (!list.contains(s)) {
            list.add(s);
        }
    }
    System.out.println("ArrayList: " + (System.currentTimeMillis() - t));

    t = System.currentTimeMillis();
    HashSet<String> set = new HashSet<>();
    for (int i = 0; i < 10000; i++) {
        String s = (Math.random() * 100) + "";
        if (!set.contains(s)) {
            set.add(s);
        }
    }
    System.out.println("HashSet: " + (System.currentTimeMillis() - t));

Mapand List are interfaces, so there is no information on their implementation nor their performance. But if you use the most current implementations (LinkedList or ArrayList for List, and HashMap for Map), the contains() method must, in the worst case, go through the entire list, and compare your element with each entry. It is an O(n) operation.

If you use an HashMap, the implementation is radically different : the HashMap contains an array with more entries than elements in it (in practice, you have an array size of between 4n/3 an 3n/2 for n elements in the map). It computes the hash of the key, which is an int, and wrap it between 0 and your array size (let's say this number is i). Then it will put the element at the index i of the array (or i+1, i+2… if previous indexes are already taken). So, when you check for the key presence with containsKey, it will re-compute the hash and the i value, and check the i, i+1… indexes until it finds an empty array cell. Theorically, you can have an O(n) worst-case, if the array is almost full, are all the keys have almost identicals i values, but with a good hash function, you have constant-time contains and get functions. (However, adding elements is fast if you don't need to resize the array, which is REALLY slow - I think you need to recompute the indexes of each key).

So a map is really faster if you need to check the key appearance in a collection, and do not need to keep the order (there is a SortedHashMap for that, but I don't know it's performance), but it will take more memory.

Also, if you don't need the key-value thing, you can use a HashSet (which is internally the same as an HashMap).

I later realized that I don't really use the values contained in my Map so a List will suffice.

Map isn't just a list of key-value pairs, it is a unique mapping from keys to values. So when you change from Map to List, you are allowing duplicates where you previously didn't. On the other hand, a Set is exactly a Map without the values. So consider using a HashSet.

As for the search complexities:

list.contains is O(n), hashSet.contains is O(1), and treeSet.contains is O(log n).

For general information on now HashMap works, google for "hashtable". For TreeMap, google for "binary tree" or similar. Wikipedia has good entries on these subjects.

Be careful, however, to avoid the class Hashtable. It's an archaeological artefact in the modern library. For your case HashSet is probably the best choice.

Which one is faster? List.contains() or Map.containsKey()

Tags:

Java

Collections

List

Map

Related

Recent Posts