Efficient binarization of a list
Edit:
@CarlWoll brought to my attention that RandomChoice may produce unpacked arrays, so I have updated the timings. Previously, I stated that Carl's solution needs ten times as long as the first proposal below. This is not true for packed arrays. Mea culpa.
list = Developer`ToPackedArray[RandomChoice[{0.5, 0.3, 0.2} -> {1, 2, 3}, 200000]];
Another very efficient possibility is
RepeatedTiming[
list1a = Subtract[1,Unitize[Subtract[list,1]]];
list2a = Subtract[1,Unitize[Subtract[list,2]]];
list3a = Subtract[1,Unitize[Subtract[list,3]]];
][[1]]
0.00069
This is still roughly three times as fast as @Carl's proposal:
RepeatedTiming[
list1b = Unitize@Clip[list, {1, 1}, {0, 0}];
list2b = Unitize@Clip[list, {2, 2}, {0, 0}];
list3b = Unitize@Clip[list, {3, 3}, {0, 0}];
][[1]]
0.00217
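To see why both one-liners produce an indicator vector, here is a small sketch on a hypothetical five-element list (the name small is mine, not part of the benchmark above). Unitize maps zero to 0 and anything nonzero to 1, so subtracting the target value first and then taking 1 minus the result marks exactly the matching positions. Clip with equal bounds keeps only the target value and sends everything else to 0.

```mathematica
small = {1, 3, 2, 2, 1};  (* hypothetical test data *)

(* Subtract/Unitize route: zero exactly where small == 2 *)
Subtract[small, 2]                        (* {-1, 1, 0, 0, -1} *)
Unitize[Subtract[small, 2]]               (* {1, 1, 0, 0, 1} *)
Subtract[1, Unitize[Subtract[small, 2]]]  (* {0, 0, 1, 1, 0} *)

(* Clip route: values below or above 2 are replaced by 0 *)
Clip[small, {2, 2}, {0, 0}]               (* {0, 0, 2, 2, 0} *)
Unitize[Clip[small, {2, 2}, {0, 0}]]      (* {0, 0, 1, 1, 0} *)
```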
Other notable ways to do it:
Using SparseArray (not so efficient):
{list1c,list2c,list3c} = Normal[SparseArray[
Transpose[{list,Range[Length[list]]}]->1,
{3,Length[list]}
]]; // RepeatedTiming // First
0.082
Another nice way is
{list1d, list2d, list3d} = Transpose[IdentityMatrix[3][[list]]]; // RepeatedTiming // First
0.0032
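The IdentityMatrix approach works because Part with a list of indices picks one row of the identity matrix per element, giving a one-hot row for each entry; Transpose then turns those rows into one indicator list per category. A minimal sketch on a hypothetical short list (small is an illustrative name only):

```mathematica
small = {1, 3, 2, 2, 1};  (* hypothetical test data *)

(* one one-hot row per element of small *)
rows = IdentityMatrix[3][[small]]
(* {{1, 0, 0}, {0, 0, 1}, {0, 1, 0}, {0, 1, 0}, {1, 0, 0}} *)

(* one indicator list per category 1, 2, 3 *)
Transpose[rows]
(* {{1, 0, 0, 0, 1}, {0, 0, 1, 1, 0}, {0, 1, 0, 0, 0}} *)
```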
This is very fast and avoids an intermediate step of creating 1s, 2s and 3s:
mylist = RandomChoice[{.3, .2, .5} -> {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}, 300];
Transpose[mylist]
Generating the list and transposing it takes about half as long as generating the list and computing the result with the fastest method described in the other answers here.
As @CarlWoll points out, using a packed array speeds things up too:
Timing[mylist =
Developer`ToPackedArray[
RandomChoice[{.3, .2, .5} -> {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}},
10^8]];
Transpose[mylist];]
{4.04332, Null}
There are many possibilities, but I like using Clip:
Unitize@Clip[list,{1,1},{0,0}]
Unitize@Clip[list,{2,2},{0,0}]
Unitize@Clip[list,{3,3},{0,0}]
{0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0}
{0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1}
{1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0}
Addendum
If your lists consist of just integers, then a compiled version is possible, e.g.:
fc = Last @ Compile[{{d, _Integer, 1}, {t,_Integer}},
Table[Boole[Compile`GetElement[d, i]==t], {i, Length@d}],
CompilationTarget->"C",
RuntimeOptions->"Speed"
];
Comparison:
lst = Developer`ToPackedArray @ RandomChoice[{.5,.2,.3}->{1,2,3}, 10^7];
r1 = fc[lst, 2]; //RepeatedTiming
r2 = BitXor[1, Unitize @ BitXor[2, lst]]; //RepeatedTiming
r1 === r2
{0.042, Null}
{0.0614, Null}
True
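The BitXor expression used for r2 follows the same idea as the Subtract/Unitize approach: BitXor of two equal integers is 0, so Unitize marks the non-matching positions, and the outer BitXor with 1 flips the 0s and 1s. A small sketch on a hypothetical list (small is an illustrative name, not from the comparison above):

```mathematica
small = {1, 3, 2, 2, 1};  (* hypothetical test data *)

BitXor[2, small]                      (* {3, 1, 0, 0, 3} *)
Unitize[BitXor[2, small]]             (* {1, 1, 0, 0, 1} *)
BitXor[1, Unitize[BitXor[2, small]]]  (* {0, 0, 1, 1, 0} *)
```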