Which Java Collection should I use?
Since I couldn't find a similar flowchart I decided to make one myself.
This flowchart does not try to cover things like synchronized access, thread safety, etc., or the legacy collections, but it does cover the 3 standard Sets, 3 standard Maps and 2 standard Lists.
This image was created for this answer and is licensed under a Creative Commons Attribution 4.0 International License. The simplest attribution is by linking to either this question or this answer.
Other resources
Probably the most useful other reference is the following page from the Oracle documentation, which describes each Collection.
HashSet vs TreeSet
There is a detailed discussion of when to use HashSet or TreeSet here: Hashset vs Treeset
ArrayList vs LinkedList
Detailed discussion: When to use LinkedList over ArrayList?
An even simpler picture is here. Intentionally simplified!
Collection is anything holding data called "elements" (of the same type). Nothing more specific is assumed.
List is an indexed collection of data where each element has an index. Something like an array, but more flexible.
Data in the list keep the order of insertion.
Typical operation: get the n-th element.
Set is a bag of elements, each element occurring just once (the elements are distinguished using their equals() method).
Data in the set are stored mostly just to know what data are there.
Typical operation: tell if an element is present in the set.
Map is something like the List, but instead of accessing the elements by their integer index, you access them by their key, which is any object. Like the array in PHP :)
Data in Map are searchable by their key.
Typical operation: get an element by its ID (where the ID is of any type, not only int as in the case of List).
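To make the three "typical operations" above concrete, here is a minimal sketch. The class name TypicalOperations, its helper methods and the sample data are made up for illustration; only the java.util calls are real.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class: one helper per "typical operation".
class TypicalOperations {
    // List: get the n-th element by its int index.
    static String nthElement(List<String> list, int n) {
        return list.get(n);
    }

    // Set: tell whether an element is present.
    static boolean isPresent(Set<String> set, String element) {
        return set.contains(element);
    }

    // Map: get a value by its key (the key can be any object, here a String).
    static Integer lookup(Map<String, Integer> map, String key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("Ann", "Bob", "Ann"));
        Set<String> unique = new HashSet<>(names); // the duplicate "Ann" collapses
        Map<String, Integer> ages = new HashMap<>();
        ages.put("Ann", 31);
        ages.put("Bob", 27);

        System.out.println(nthElement(names, 1));      // Bob
        System.out.println(unique.size());             // 2
        System.out.println(isPresent(unique, "Bob"));  // true
        System.out.println(lookup(ages, "Ann"));       // 31
    }
}
```

Note that the List keeps both "Ann" entries, while the Set built from it keeps only one.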
The differences
Set vs. Map: in Set you search data by themselves, whilst in Map by their key.
N.B. The standard library's HashSet, LinkedHashSet and TreeSet are indeed implemented exactly like this: a map where the keys are the Set elements themselves, and with a dummy value.
List vs. Map: in List you access elements by their int index (position in List), whilst in Map by their key, which is of any type (typically: ID).
List vs. Set: in List the elements are bound by their position and can be duplicated, whilst in Set the elements are just "present" (or not present) and are unique (in the meaning of equals(), or compareTo() for SortedSet).
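The equals() vs. compareTo() distinction is easy to overlook, so here is a small sketch of it. It relies on a known quirk of BigDecimal: 1.0 and 1.00 are not equal according to equals() (their scales differ), but compareTo() treats them as the same value. The class and method names are made up for illustration.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class: the same values, counted under two notions of uniqueness.
class SetUniqueness {
    // HashSet decides uniqueness with equals() (and hashCode()).
    static int distinctByEquals(List<BigDecimal> values) {
        return new HashSet<>(values).size();
    }

    // TreeSet (a SortedSet) decides uniqueness with compareTo().
    static int distinctByCompareTo(List<BigDecimal> values) {
        return new TreeSet<>(values).size();
    }

    public static void main(String[] args) {
        List<BigDecimal> values = List.of(
                new BigDecimal("1.0"), new BigDecimal("1.00"), new BigDecimal("2"));

        System.out.println(values.size());               // 3 (a List keeps everything)
        System.out.println(distinctByEquals(values));    // 3 (1.0 and 1.00 differ by equals())
        System.out.println(distinctByCompareTo(values)); // 2 (1.0 and 1.00 compare as equal)
    }
}
```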
Summary of the major non-concurrent, non-synchronized collections
Collection: An interface representing an unordered "bag" of items, called "elements". The "next" element is undefined (random).
Set: An interface representing a Collection with no duplicates.
HashSet: A Set backed by a hash table (internally, a HashMap). Fastest and smallest memory usage, when ordering is unimportant.
LinkedHashSet: A HashSet with the addition of a linked list to associate elements in insertion order. The "next" element is the next-most-recently inserted element.
TreeSet: A Set where elements are ordered by a Comparator (typically natural ordering). Slowest and largest memory usage, but necessary for comparator-based ordering.
EnumSet: An extremely fast and efficient Set customized for a single enum type.
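A quick way to see the difference between these Set implementations is to look at their iteration order. The following is only a sketch: the class SetOrdering, the Day enum and the sample strings are invented for the example. HashSet is left out because its iteration order is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class: the same input, iterated through different Sets.
class SetOrdering {
    enum Day { MON, TUE, WED }

    // LinkedHashSet: duplicates dropped, first-insertion order kept.
    static List<String> insertionOrder(List<String> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    // TreeSet: duplicates dropped, natural (here alphabetical) order.
    static List<String> sortedOrder(List<String> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    // EnumSet: iterates in the order the constants are declared in the enum.
    static List<Day> declarationOrder(Day... days) {
        EnumSet<Day> set = EnumSet.noneOf(Day.class);
        set.addAll(Arrays.asList(days));
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> input = List.of("pear", "apple", "pear", "fig");
        System.out.println(insertionOrder(input));              // [pear, apple, fig]
        System.out.println(sortedOrder(input));                 // [apple, fig, pear]
        System.out.println(declarationOrder(Day.WED, Day.MON)); // [MON, WED]
    }
}
```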
List: An interface representing a Collection whose elements are ordered and each have a numeric index representing its position, where zero is the first element, and (size - 1) is the last.
ArrayList: A List backed by an array, where the array has a length (called "capacity") that is at least as large as the number of elements (the list's "size"). When size exceeds capacity (when the (capacity + 1)-th element is added), the array is recreated with a larger capacity (roughly old capacity * 1.5 in the OpenJDK implementation); this recreation is fast, since it uses System.arraycopy(). Deleting and inserting/adding elements requires all neighboring elements (to the right) to be shifted into or out of that space. Accessing any element is fast, as it only requires the calculation (element-zero-address + desired-index * element-size) to find its location. In most situations, an ArrayList is preferred over a LinkedList.
LinkedList: A List backed by a set of objects, each linked to its "previous" and "next" neighbors. A LinkedList is also a Queue and Deque. Accessing elements is done starting at the first or last element, and traversing until the desired index is reached. Insertion and deletion, once the desired index is reached via traversal, is a trivial matter of re-mapping only the immediate-neighbor links to point to the new element or bypass the now-deleted element.
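As a rough illustration of the two insertion styles described above, the sketch below inserts into the middle of each kind of list. The class ListInsertion and its methods are hypothetical names. The observable result is the same for both lists; what differs is the work done internally (shifting array slots vs. re-linking neighbors).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class: inserting into the middle of a list.
class ListInsertion {
    // Index-based insert. On an ArrayList, every element at or after
    // 'index' is shifted one slot to the right to make room.
    static List<Integer> insertAt(List<Integer> list, int index, int value) {
        list.add(index, value);
        return list;
    }

    // Iterator-based insert. On a LinkedList, the iterator first traverses
    // to 'index'; the insert itself only re-links the neighboring nodes.
    static List<Integer> insertViaIterator(LinkedList<Integer> list, int index, int value) {
        ListIterator<Integer> it = list.listIterator(index);
        it.add(value);
        return list;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(List.of(1, 2, 4));
        LinkedList<Integer> linkedList = new LinkedList<>(List.of(1, 2, 4));

        System.out.println(insertAt(arrayList, 2, 3));           // [1, 2, 3, 4]
        System.out.println(insertViaIterator(linkedList, 2, 3)); // [1, 2, 3, 4]

        // A LinkedList is also a Deque, so both ends are directly accessible.
        linkedList.addFirst(0);
        System.out.println(linkedList.peekLast());               // 4
    }
}
```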
Map: An interface representing a collection where each element has an identifying "key"; each element is a key-value pair. (In Java, Map does not actually extend the Collection interface.)
HashMap: A Map where keys are unordered, backed by a hash table.
LinkedHashMap: A Map where keys are ordered by insertion order.
TreeMap: A Map where keys are ordered by a Comparator (typically natural ordering).
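The same ordering rules can be checked on the Map implementations by looking at the order of their keys. This is just a sketch with invented names (MapOrdering, the fruit keys, and string length as a stand-in value). HashMap is omitted since its key order is unspecified.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class: key order of LinkedHashMap vs. TreeMap.
class MapOrdering {
    // Fills the given map with key -> key length, then returns the keys
    // in the map's own iteration order.
    static List<String> keyOrder(Map<String, Integer> map, String... keys) {
        for (String key : keys) {
            map.put(key, key.length());
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        String[] keys = {"pear", "apple", "fig"};

        // LinkedHashMap: insertion order.
        System.out.println(keyOrder(new LinkedHashMap<>(), keys)); // [pear, apple, fig]
        // TreeMap: natural ordering of the keys.
        System.out.println(keyOrder(new TreeMap<>(), keys));       // [apple, fig, pear]
        // TreeMap with an explicit Comparator.
        System.out.println(
                keyOrder(new TreeMap<>(Comparator.reverseOrder()), keys)); // [pear, fig, apple]
    }
}
```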
Queue: An interface that represents a Collection where elements are, typically, added to one end, and removed from the other (FIFO: first-in, first-out).
Stack: A class (a legacy subclass of Vector, not an interface) that represents a Collection where elements are, typically, both added (pushed) and removed (popped) from the same end (LIFO: last-in, first-out).
Deque: Short for "double ended queue", usually pronounced "deck". An interface for a queue that is typically only added to and read from either end (not the middle); implemented by LinkedList and ArrayDeque.
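The FIFO vs. LIFO difference can be sketched with a single implementation, ArrayDeque, used once through the Queue interface and once as a stack through the Deque interface. The class QueueVsStack and its method names are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Queue;

// Hypothetical demo class: the same input drained in FIFO and LIFO order.
class QueueVsStack {
    // Queue: add at the tail, remove from the head.
    static List<Integer> drainFifo(List<Integer> input) {
        Queue<Integer> queue = new ArrayDeque<>();
        for (Integer item : input) {
            queue.offer(item);
        }
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    // Stack behaviour via Deque: push and pop both work on the head.
    static List<Integer> drainLifo(List<Integer> input) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Integer item : input) {
            stack.push(item);
        }
        List<Integer> out = new ArrayList<>();
        while (!stack.isEmpty()) {
            out.add(stack.pop());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1, 2, 3);
        System.out.println(drainFifo(input)); // [1, 2, 3]
        System.out.println(drainLifo(input)); // [3, 2, 1]
    }
}
```

The Javadoc for Deque recommends using a Deque like this in preference to the legacy Stack class.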
Basic collection diagrams:
Comparing the insertion of an element with an ArrayList and LinkedList: