Fast way to find if a string is in an array
From @nnnnnn in the comments:
Convert the array into an object, something like this:
object = {}
array.forEach(function(string) { // Not cross-browser compatible, it's just an example
object[string] = 1;
}
Then perform lookups like this:
if (string in object) {
First of all, there is some fundamental confusion here about what data structures are available in JavaScript.
TL;DR
If you want fastest key lookup for objects with short prototype inheritance chain use
in
.If you want the same, but for the objects with extensive inheritance chain, use
Object.prototype.hasOwnProperty
If you want the fastest value lookup, use
Array.prototype.indexOf
forArray
.There isn't a built in function for value lookup in hash-tables. You can, of course, roll your own, but there are many libraries that provide one already. For example, Underscore provides one (it calls it
indexOf
).
JavaScript has no arrays
Fundamentally, JavaScript has only hash-tables. The standard Array
function constructs hash-tables (I will call these integer hash-tables, or int-hash-tables) where the keys are integers in addition to string keys. These perform similarly to arrays, but they differ in certain ways. There are cons and pros. For example, deleting an element from int-hash-table is an O(1) operation, while deleting an element from array is an O(n) operation (because you need to copy the rest of the elements into the new array). This is why Array.prototype.splice
function in JavaScript is very fast. The downside is the complexity of implementation.
So, when you say Array
in JavaScript context it is understood as int-hash-table, and all the asymptotic complexity associated with it. This means that if you wanted to find a string value inside an int-hash-table, then it would be an O(n) operation. There is a standard function for doing it: Array.prototype.indexOf
. However, if you wanted to look for the key, then there are two functions: in
and Object.prototype.hasOwnProperty
.
Somewhat counterintuitively:
[1, 2, 3].hasOwnProperty(0); // true
0 in [1, 2, 3]; // true
The difference between the two needs further explaining. It is related to the fact that all things in JavaScript are objects, and thus they have some object-y features. One such features the prototype
, the link between the object and it's prototype. It is a hierarchical structure of hash-tables, each containing properties of objects.
in
looks up the immediate hash-table of the object and then recursively searches the hash-tables of the prototypes of this objects.Whereas
Object.prototype.hasOwnProperty
only looks into the immediate hash-table. You might think it should be faster, but wait jumping to conclusions.
Due to the dynamic nature of JavaScript all function calls are dynamic and the environment must take a lot of care to ensure fail-safe code execution. This means that in JavaScript function calls are very expensive. So, going through Object.prototype.hasOwnProperty
may be a lot more expensive then going through in
, even though theoretically it should be the opposite. However, given tall-enough inheritance tree and enough of inherited properties, eventually, Object.prototype.hasOwnProperty
will take over.
Some examples to get a better intuition:
>>> var array = [1, 2, 3];
undefined
>>> 3 in array;
false
>>> array.hasOwnProperty(3);
false
>>> 3 in array;
false
>>> array.__proto__ = [1, 2, 3, 4];
[1, 2, 3, 4]
>>> 3 in array;
true
>>> array.hasOwnProperty(3);
false