What's fastest way of defining 10^5 down values?
Summary: undocumented HashTable
is a bit faster (at least in version 9) both in storage and in retrieval than DownValues
.
DownValues
list = RandomInteger[{-10^9, 10^9}, 10^6];
ret = RandomInteger[{-10^9, 10^9}, 10^6];
isGood[___] = False;
Scan[(isGood[#] = True) &, list]; // AbsoluteTiming
(* ==> {3.240005, Null} *)
ClearAll[isGood];
isGood[___] = False;
Do[isGood@i = True, {i, list}]; // AbsoluteTiming
(* ==> {2.350003, Null} *)
On my computer, this takes less then 3 seconds for a million integers if Do
is used instead of Scan
. Isn't this fast enough?
Retrieval of the results is also quite quick, and is almost independent whether Table
or Map
is used:
isGood /@ ret; // AbsoluteTiming
(* ==> {1.410002, Null} *)
Table[isGood@i, {i, ret}]; // AbsoluteTiming
(* ==> {1.450002, Null} *)
HashTable
Out of curiosity, I compared this to the undocumented HashTable
(mentioned here) and got even better results. Note, that the hash table must be checked for existing value (as list
might contain duplicates) otherwise HashTableAdd
returns with error. Or it is even better to prefilter list
by removing duplicates, but that is omitted here not to bias the comparison.
hash = System`Utilities`HashTable[];
Do[If[
Not@System`Utilities`HashTableContainsQ[hash, i],
System`Utilities`HashTableAdd[hash, i, True] (* last argument can be omitted *)
], {i, list}]; // AbsoluteTiming
(* ==> {2.010003, Null} *)
System`Utilities`HashTableContainsQ[hash, #] & /@ ret; // AbsoluteTiming
(* ==> {1.340002, Null} *)
Table[System`Utilities`HashTableContainsQ[hash, i], {i, ret}]; // AbsoluteTiming
(* ==> {1.050001, Null} *)
We see that both storage and retrieval are a bit faster.
Are you sure you want to use UpValues
? You can use Dispatch
which is pretty fast when generating the lookup table and is equally fast when accessing values:
n = 6;
list = RandomInteger[{0, 10^(n + 1)}, {10^n}];
AbsoluteTiming[disp = Dispatch@Thread[list -> True];]
{1.6220927, Null}
Remove[isGood];
AbsoluteTiming[isGood[___] = False; Scan[(isGood[#] = True) &, list]]
{3.5982058, Null}
Query values:
test = RandomInteger[{0, 10^(n + 1)}, {10^n}];
AbsoluteTiming[Count[test /. disp, True]]
{1.9151096, 94844}
AbsoluteTiming[Count[isGood /@ test, True]]
{1.7601007, 94844}
This seems to be faster to define downvalues
list = RandomInteger[{-100000000, 100000000}, 1000000];
DownValues[isGood] =
HoldPattern[isGood[#]] :> True & /@ list; // AbsoluteTiming
isGood[___] = False;