What's the difference between HashSet and Set?
A Set represents a generic "set of values". A TreeSet is a set where the elements are sorted (and thus ordered); a HashSet is a set where the elements are not sorted or ordered.
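As a minimal sketch of that difference (the class name SetOrderDemo and the helper iterationOrder are made up for illustration), the same words can be loaded into both implementations and iterated:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetOrderDemo {
    // Hypothetical helper: copies the elements in the order the set iterates them.
    static List<String> iterationOrder(Set<String> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "fig", "apple");

        Set<String> sorted = new TreeSet<>(words); // natural (alphabetical) order, duplicate dropped
        Set<String> hashed = new HashSet<>(words); // same elements, no ordering guarantee

        System.out.println(iterationOrder(sorted)); // [apple, fig, pear]
        System.out.println(iterationOrder(hashed)); // some order; do not rely on it
        System.out.println(sorted.equals(hashed));  // true: equality ignores order
    }
}
```

Both variables are declared as Set, so the only visible difference here is the iteration order.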
A HashSet is typically a lot faster than a TreeSet.
A TreeSet is typically implemented as a red-black tree (see http://en.wikipedia.org/wiki/Red-black_tree - I've not validated the actual implementation of Sun/Oracle's TreeSet), whereas a HashSet uses Object.hashCode() to create an index into an array. Access time for a red-black tree is O(log(n)), whereas access time for a HashSet ranges from constant time to the worst case (every item has the same hashCode), where you can have a linear search time of O(n).
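To make the worst case concrete, here is a rough sketch with a made-up key type, CollidingKey, whose hashCode is deliberately constant (the class names and the value 42 are illustrative assumptions). The set still behaves correctly, but every element hashes to the same slot, so lookups can no longer jump straight to the right entry. How much slower that is depends on the JDK, so no timings are shown.

```java
import java.util.HashSet;
import java.util.Set;

public class CollidingKeyDemo {
    // Hypothetical key type: equals() distinguishes keys, hashCode() does not.
    static final class CollidingKey {
        final int id;

        CollidingKey(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof CollidingKey && ((CollidingKey) o).id == id;
        }

        @Override
        public int hashCode() {
            return 42; // every key collides on purpose
        }
    }

    // Builds a HashSet of n distinct keys that all share one hash code.
    static Set<CollidingKey> build(int n) {
        Set<CollidingKey> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new CollidingKey(i));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<CollidingKey> set = build(1000);
        System.out.println(set.size());                           // 1000
        System.out.println(set.contains(new CollidingKey(999)));  // true
        System.out.println(set.contains(new CollidingKey(1000))); // false
    }
}
```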
The HashSet is an implementation of a Set.
The question has been answered, but I haven't seen an answer explaining why the code mentions both types.
Typically, you want to code against interfaces, which in this case is Set. Why? Because if you always reference your object through the interface (everywhere except the new HashSet() call), it is trivial to change the implementation later if you find another one would be better: you've only mentioned it once in your code base (where you did new HashSet()).
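A small sketch of that idea, with the illustrative names InterfaceDemo and countDistinct (not from the original question): the helper only knows about Set, so the caller decides which implementation to construct.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class InterfaceDemo {
    // Depends only on the Set interface, so any implementation can be passed in.
    static int countDistinct(Set<String> seen, String... words) {
        for (String w : words) {
            seen.add(w);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // The concrete class is named only where the object is created.
        Set<String> hashed = new HashSet<>();
        Set<String> sorted = new TreeSet<>();

        System.out.println(countDistinct(hashed, "b", "a", "b")); // 2
        System.out.println(countDistinct(sorted, "b", "a", "b")); // 2
    }
}
```

Switching from HashSet to TreeSet (or back) changes one line; countDistinct and everything else typed against Set stays untouched.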