How can I create an enumeration variable by groups?
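The snippets that follow operate on a table named data whose first row is a header. The question's input is not reproduced here, so the following is a minimal stand-in reconstructed from the outputs shown below; treat it as an assumption rather than the asker's exact data.

```wolfram
(* Assumed sample input, reconstructed from the outputs shown in the answers. *)
(* bb, dd and cc are left as plain symbols, matching how they print there. *)
data = {{"ID", "Value"}, {1, 48}, {1, 45}, {1, 52}, {1, 43}, {1, 41},
   {2, 50}, {2, 42}, {2, 51}, {2, 52}, {bb, 52}, {bb, 54},
   {dd, 20}, {dd, 25}, {dd, 27}, {cc, 30}};
```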
r1 = Join[{Append[First[data], "Count"]}, Join @@ Values @
GroupBy[Rest @data, First, MapIndexed[Join, #]&]]
{{ID,Value,Count},{1,48,1},{1,45,2},{1,52,3},{1,43,4},{1,41,5},{2,50,1},{2,42,2},{2,51,3},{2,52,4},{bb,52,1},{bb,54,2},{dd,20,1},{dd,25,2},{dd,27,3},{cc,30,1}}
Update 1:
Prepend[Join @@ (MapIndexed[Join, #]&/@ SplitBy[Rest @ data, First]),
Append[First[data], "Count"]]
{{ID,Value,Count},{1,48,1},{1,45,2},{1,52,3},{1,43,4},{1,41,5},{2,50,1},{2,42,2},{2,51,3},{2,52,4},{bb,52,1},{bb,54,2},{dd,20,1},{dd,25,2},{dd,27,3},{cc,30,1}}
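Update 2 (a counter-based version; it relies on the header being the only row whose first element is a string):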
addCounter = Module[{cnt}, cnt[_String] := "Count"; cnt[_] := 1; {##, cnt[#]++} & @@@ #]&;
addCounter @ data
{{"ID", "Value", "Count"}, {1, 48, 1}, {1, 45, 2}, {1, 52, 3}, {1, 43, 4}, {1, 41, 5}, {2, 50, 1}, {2, 42, 2}, {2, 51, 3}, {2, 52, 4}, {bb, 52, 1}, {bb, 54, 2}, {dd, 20, 1}, {dd, 25, 2}, {dd, 27, 3}, {cc, 30, 1}}
Here's one possibility using Split (which assumes that the IDs being counted always appear in contiguous runs):
splitcount =
Transpose[
Flatten[{Transpose@#,
{Flatten[{
"Count", Range /@ Length /@ Split[#[[2 ;;, 1]]]
}]}
}, 1]
] &;
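To see why the runs assumption matters, here is a small sketch of just the counting step. The list ids is made up for illustration and is not taken from the question.

```wolfram
(* Hypothetical IDs: the last 1 is separated from the first run of 1s. *)
ids = {1, 1, 1, 2, 2, 1};

(* Split groups only adjacent identical elements. *)
runs = Split[ids]                    (* {{1, 1, 1}, {2, 2}, {1}} *)

(* One counter per run, restarting at 1 each time. *)
Flatten[Range /@ Length /@ runs]     (* {1, 2, 3, 1, 2, 1} *)
```

The final 1 gets a count of 1 rather than 4, so if an ID can reappear after other IDs, the data would need to be sorted by ID first, or a method that keeps a tally per ID would be the safer choice.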
To do some time trials on longer data lists (though not as long as the one you're looking at), first build some data:
SeedRandom[123]
idlist = Flatten[
ConstantArray[#, RandomInteger[{1, 100}]] & /@ Range[10000]];
vallist = RandomInteger[{1, 60}, Length@idlist];
data = Join[{{"ID", "Value"}}, Transpose[{idlist, vallist}]];
Length@data
(* 507939 *)
Then time each approach and check that the results agree:
AbsoluteTiming[
res1 = splitcount[data];
]
AbsoluteTiming[
(* @kglr *)
res2 = Join[{Append[First[data], "Count"]},
Join @@ Values@GroupBy[Rest@data, First, MapIndexed[Join, #] &]];
]
AbsoluteTiming[
(* @kglr *)
res3 = Prepend[
Join @@ (MapIndexed[Join, #] & /@ SplitBy[Rest@data, First]),
Append[First[data], "Count"]];
]
AbsoluteTiming[
(* @kglr *)
res4 = addCounter@data;
]
AbsoluteTiming[
(* @JasonB. *)
keys = AssociationThread[Union[data[[2 ;;, 1]]] -> 0];
tally = Join[{"Count"}, keys[#] += 1 & /@ data[[2 ;;, 1]]];
res5 = MapThread[Append, {data, tally}];
]
res1 == res2 == res3 == res4 == res5
(* {0.416703, Null}
{0.739995, Null}
{1.16564, Null}
{1.81539, Null}
{1.93773, Null}
True *)
I can't speak to the efficiency of this; you'll have to try it on your dataset:
keys = AssociationThread[Union[data[[2 ;;, 1]]] -> 0];
tally = Join[{"Count"}, keys[#] += 1 & /@ data[[2 ;;, 1]]];
data2 = MapThread[Append, {data, tally}]
(* {{"ID", "Value", "Count"}, {1, 48, 1}, {1, 45, 2}, {1, 52,
3}, {1, 43, 4}, {1, 41, 5}, {2, 50, 1}, {2, 42, 2}, {2, 51, 3}, {2,
52, 4}, {bb, 52, 1}, {bb, 54, 2}, {dd, 20, 1}, {dd, 25, 2}, {dd, 27,
3}, {cc, 30, 1}} *)
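As a small illustration of how this running tally behaves when IDs are interleaved rather than grouped, here is a sketch on an invented list. The names ids, counts and tally are placeholders for this example only.

```wolfram
(* Hypothetical, interleaved IDs. *)
ids = {"a", "b", "a", "b", "a"};

(* One counter per distinct ID, all starting at 0. *)
counts = AssociationThread[Union[ids] -> 0];

(* Increment the counter for each ID as it is met; AddTo returns the new value. *)
tally = (counts[#] += 1) & /@ ids    (* {1, 1, 2, 2, 3} *)
```

Because the counter is looked up by ID rather than by position, the rows keep their original order and each ID continues counting from where it left off.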