Counting subsequences (i.e., patterns) within a list?
data = {1, 1, 1, 2, 1, 2, 2, 2, 1, 2, 1};
sub = {{1, 1}, {1, 2}, {2, 1}, {2, 2}};
Count[Partition[data, 2, 1], Alternatives @@ sub]
10
Or each separately:
Count[Partition[data, 2, 1], #] & /@ sub
{2, 3, 3, 2}
More generally, for subsequences of different lengths:
Length[ReplaceList[data, {___, ##, ___} :> {}]] & @@@ sub
{2, 3, 3, 2}
Length[ReplaceList[data, {___, ##, ___} :> {}]] & @@@ {{1, 2, 1}, {1}}
{2, 6}
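The Partition approach from the top of this answer can also handle mixed lengths if the window size is taken from each subsequence. A minimal sketch, where countSub is a hypothetical helper name introduced only for illustration; it assumes each subsequence is a non-empty list of literal values:

```mathematica
data = {1, 1, 1, 2, 1, 2, 2, 2, 1, 2, 1};

(* countSub is an illustrative name, not a built-in. It slides a window
   of length Length[sub] over list and counts the windows equal to sub. *)
countSub[list_List, sub_List] := Count[Partition[list, Length[sub], 1], sub]

countSub[data, #] & /@ {{1, 2, 1}, {1}}
(* {2, 6} *)
```

Unlike the ReplaceList version, this builds every window explicitly, so it trades some memory for avoiding the pattern matcher.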
String processing is often useful for these problems if it can be applied.
data = {1, 1, 1, 2, 1, 2, 2, 2, 1, 2, 1};
pat = {{1, 1}, {1, 2}, {2, 1}, {2, 2}};
ts = ToString @ Row[#, ","] &;
StringCount[ts@data, ts /@ pat, Overlaps -> True]
10
Individually:
With[{ds = ts@data},
StringCount[ds, ts@#, Overlaps -> True] & /@ pat
]
{2, 3, 3, 2}
Or perhaps better:
StringCases[ts@data, ts@# -> # & /@ pat, Overlaps -> True] // Tally
{{{1, 1}, 2}, {{1, 2}, 3}, {{2, 1}, 3}, {{2, 2}, 2}}
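One case where the string approach needs care is multi-digit elements: with a plain comma separator, the pattern "1,2" also matches inside "11,2". A possible workaround, sketched here under the assumption that the elements never print with angle brackets, is to delimit every element on both sides. The names tsPlain and tsDelim are hypothetical:

```mathematica
(* tsPlain mirrors the comma-separated encoding used above *)
tsPlain = Function[list, ToString[Row[list, ","]]];

(* tsDelim is an illustrative variant: every element becomes "<n>" *)
tsDelim = Function[list, StringJoin["<" <> ToString[#] <> ">" & /@ list]];

StringCount[tsPlain[{11, 2}], tsPlain[{1, 2}], Overlaps -> True]
(* 1, a false positive: "1,2" is found inside "11,2" *)

StringCount[tsDelim[{11, 2}], tsDelim[{1, 2}], Overlaps -> True]
(* 0, because "<1><2>" does not occur in "<11><2>" *)
```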
Another way is to use ListCorrelate with suitable arguments, for example:
data = {1, 1, 1, 2, 1, 2, 2, 2, 1, 2, 1};
sub = {{1, 1}, {1, 2}, {2, 1}, {2, 2}};
Total@Boole@
ListCorrelate[#, data, {1, -1}, "paddingUselessHere", Equal, And] & /@ sub
{2, 3, 3, 2}
Another example
Let's say you're just looking for the longer pattern {1,2,1}:
Total@Boole@ListCorrelate[{1, 2, 1}, data, {1, -1}, "paddingUselessHere", Equal, And]
2
Explanation
It is easy to understand what ListCorrelate
does with this example :
ListCorrelate[{p1, p2}, {a, b, c, d}, {1, -1}, "whatever", f, g]
{g[f[p1, a], f[p2, b]], g[f[p1, b], f[p2, c]], g[f[p1, c], f[p2, d]]}
(When the 3rd argument of ListCorrelate is {1,-1}, which is the default, the kernel never overhangs the data, so the 4th argument (the padding) is never used. That's why I set it to "whatever".)
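To make the symbolic form concrete, here is a small worked case with f = Equal and g = And on a short list. The padding value 0 is an arbitrary placeholder, since no padding is needed with {1, -1}:

```mathematica
(* Each window of length 2 is compared element-wise with the kernel {1, 2} *)
matches = ListCorrelate[{1, 2}, {1, 1, 2, 1}, {1, -1}, 0, Equal, And]
(* {False, True, False}
   windows: {1,1} -> And[1 == 1, 2 == 1]
            {1,2} -> And[1 == 1, 2 == 2]
            {2,1} -> And[1 == 2, 2 == 1] *)

(* Boole turns True/False into 1/0, so Total counts the matching windows *)
Total[Boole[matches]]
(* 1 *)
```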