SCALA: Which data structures are optimal in which situations when using ".contains()" or ".exists()"?
Set
and Map
(with a default hash table implementation) are by far the fastest at contains
since they compute the hash value and jump to the right location immediately. For example, if you want to find an arbitrary string out of a list of a thousand, contains
on a set is about 100x faster than contains
on List
or Vector
or Array
.
With exists
, you really just care about how fast the collection is to traverse--you have to traverse everything anyway. There, List
is usually the champ (unless you want to traverse an array by hand), but only Set
and so on are usually particularly bad (e.g. exists
on List
is ~8x faster than on a Set
when each have 1000 elements). The others are within about 2.5x of List
(usually 1.5x, but Vector
has an underlying tree structure which is not all that fast to traverse).