Find pattern with Position
SequenceCases[a, {L, S}]
{{L, S}, {L, S}}
SequencePosition[a, {L, S}]
{{1, 2}, {4, 5}}
SequenceCount[a, {L, S}]
2
Edit:
By default, Position searches all levels of an expression, while SequencePosition does not. One can restrict Position to the first level to gain a speed-up. Still, SequencePosition is a bit faster.
a = RandomChoice[{S, L}, 1000000];
ls = Position[Partition[a, 2, 1], {L, S}]; // MaxMemoryUsed // RepeatedTiming
ls2 = Position[Partition[a, 2, 1], {L, S}, 1, Heads -> False]; // MaxMemoryUsed // RepeatedTiming
ls3 = Flatten[SequencePosition[a, {L, S}]][[1 ;; ;; 2]]; // MaxMemoryUsed // RepeatedTiming
Flatten[ls] == Flatten[ls2] == ls3
{0.353, 96256488}
{0.283, 96256568}
{0.192, 32998448}
True
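To see the level-specification difference on a small input, here is a minimal sketch. The nested list `expr` and the names `posAll` and `posTop` are made up for illustration and are not part of the benchmark above.

```wolfram
(* Hypothetical input: {L, S} occurs at level 1 and again one level deeper *)
expr = {{L, S}, {{L, S}}, {S, L}};

(* Default: all levels are searched, so the nested occurrence is reported too *)
posAll = Position[expr, {L, S}]

(* Level specification 1 with Heads -> False: only top-level elements are tested *)
posTop = Position[expr, {L, S}, 1, Heads -> False]
```

Here `posAll` should be `{{1}, {2, 1}}` and `posTop` should be `{{1}}`. For the flat list of pairs produced by Partition both calls find the same positions, so the restriction only saves work.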
Apparently, SequencePosition profits from packed arrays, while Position does not in this case. So it is even faster to recode the dataset as integers:
b = Developer`ToPackedArray@With[{L = 0, S = 1}, Evaluate[a]]; // MaxMemoryUsed // RepeatedTiming
pat = Developer`ToPackedArray[{0, 1}];
ls4 = Flatten[SequencePosition[b, pat]][[1 ;; ;; 2]]; // MaxMemoryUsed // RepeatedTiming
ls3 == ls4
{0.031, 16000320}
{0.090, 33021904}
True
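A quick way to check whether a list is packed is Developer`PackedArrayQ. The following is only a sketch on a hypothetical short list (`aSmall` and `bSmall` are illustrative names): a list of symbols cannot be packed, while its integer recoding can.

```wolfram
(* Hypothetical short symbolic list; symbols cannot be stored in a packed array *)
aSmall = {L, S, S, L, S};
packedBefore = Developer`PackedArrayQ[aSmall]    (* False *)

(* Recode L -> 0, S -> 1 and pack the resulting machine integers *)
bSmall = Developer`ToPackedArray[aSmall /. {L -> 0, S -> 1}];
packedAfter = Developer`PackedArrayQ[bSmall]     (* True *)
```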
Since I really like sparse arrays, here is an even faster method using SparseArray (note that we need the recoded list b from above):
ls5 = Flatten[SparseArray[UnitStep[Differences[b] - 1]]["NonzeroPositions"]]; // MaxMemoryUsed // RepeatedTiming
ls4 == ls5
{0.017, 16579184}
True
Pick is essentially on par; sometimes it is a tad faster:
ls6 = Pick[Range[Length[b] - 1], Differences[b], 1]; // MaxMemoryUsed // RepeatedTiming
ls5 == ls6
{0.015, 24000568}
True
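Why the Differences trick works: with L recoded as 0 and S as 1, a difference of 1 between neighbours occurs exactly where an L is followed by an S. A small worked sketch follows, assuming a hypothetical five-element list `bSmall`; the intermediate names are illustrative only.

```wolfram
(* Hypothetical recoded list, i.e. {L, S, S, L, S} with L -> 0, S -> 1 *)
bSmall = {0, 1, 1, 0, 1};

d = Differences[bSmall]              (* {1, 0, -1, 1}: 1 marks "0 followed by 1" *)
mask = UnitStep[d - 1]               (* {1, 0, 0, 1}: only the entries equal to 1 survive *)

(* Both variants extract the indices where the mask (or d itself) equals 1 *)
viaSparse = Flatten[SparseArray[mask]["NonzeroPositions"]]
viaPick = Pick[Range[Length[bSmall] - 1], d, 1]
```

Both `viaSparse` and `viaPick` evaluate to `{1, 4}`, the starting positions of the two {L, S} pairs.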
One way to approach this would be to partition the sequence into pairs and then find the positions of the pairs:
ls = Position[Partition[a, 2, 1], {L, S}]
For the list above this gives {{1}, {4}}, which shows that the two {L, S} pairs start at the first and fourth positions. To find the separation between the occurrences:
Differences[ls]
To get the number of occurrences:
Length[ls]
If there are no occurrences, then you will get 0.
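Putting the three steps together on a hypothetical short list (the names `aSmall`, `lsSmall`, `gaps`, `count`, and `none` are illustrative; Flatten is added here only to turn the position list into plain integers):

```wolfram
(* Hypothetical input containing {L, S} at positions 1 and 4 *)
aSmall = {L, S, S, L, S};

(* Start positions of the overlapping pairs that equal {L, S} *)
lsSmall = Flatten[Position[Partition[aSmall, 2, 1], {L, S}, 1, Heads -> False]]   (* {1, 4} *)

gaps = Differences[lsSmall]    (* {3}: distance between consecutive occurrences *)
count = Length[lsSmall]        (* 2 *)

(* A list without any {L, S} pair yields an empty position list, hence a count of 0 *)
none = Length[Position[Partition[{S, S, S}, 2, 1], {L, S}]]   (* 0 *)
```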