Counting occurrences in a column and creating a variable in R
You were almost there! rle will work very nicely; you just need to sort your table on ID before computing rle:
CT <- data.frame( value = runif(10) , id = sample(5, 10, replace = TRUE) )
# sort on ID when calculating rle, so each id forms a single run
Count <- rle( sort( CT$id ) )
# match each row's id to its run and look up the run length
CT$Count <- Count$lengths[ match( CT$id , Count$values ) ]
CT
#        value id Count
#1  0.94282600  1     4
#2  0.12170165  2     2
#3  0.04143461  1     4
#4  0.76334609  3     2
#5  0.87320740  4     1
#6  0.89766749  1     4
#7  0.16539820  1     4
#8  0.98521044  5     1
#9  0.70609853  3     2
#10 0.75134208  2     2
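To see why the sort matters, here is a minimal sketch on a small fixed vector (the vector `id` below is a made-up example, not the data above). rle only counts consecutive runs, so on unsorted input it reports run lengths rather than totals per value:

```r
# Hypothetical fixed ids, so the result does not depend on random sampling
id <- c(1, 2, 1, 3, 1, 2)

# Unsorted: no value repeats consecutively, so every run has length 1
unsorted_runs <- rle(id)
unsorted_runs$lengths

# Sorted: each value forms exactly one run, whose length is its total count
sorted_runs <- rle(sort(id))
sorted_runs$values
sorted_runs$lengths

# Map the per-value counts back onto the original row order
count <- sorted_runs$lengths[match(id, sorted_runs$values)]
count
```

Here the value 1 appears three times, 2 twice and 3 once, so count should come out as 3 2 3 1 3 2.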
data.table usually provides the quickest way:
set.seed(3)
library(data.table)
ct <- data.table(id=sample(1:10,15,replace=TRUE),item=round(rnorm(15),3))
# := adds countid to ct by reference; st is just a second name for the same table
st <- ct[, countid := .N, by = id]
st
    id   item countid
 1:  2  0.953       2
 2:  9  0.535       2
 3:  4 -0.584       2
 4:  4 -2.161       2
 5:  7 -1.320       3
 6:  7  0.810       3
 7:  2  1.342       2
 8:  3  0.693       1
 9:  6 -0.323       5
10:  7 -0.117       3
11:  6 -0.423       5
12:  6 -0.835       5
13:  6 -0.815       5
14:  6  0.794       5
15:  9  0.178       2
If you don't feel the need to use base R, plyr makes this task easy:
> set.seed(3)
> library(plyr)
> ct <- data.frame(id=sample(1:10,15,replace=TRUE),item=round(rnorm(15),3))
> ct <- ddply(ct,.(id),transform,idcount=length(id))
> head(ct)
  id   item idcount
1  2  0.953       2
2  2  1.342       2
3  3  0.693       1
4  4 -0.584       2
5  4 -2.161       2
6  6 -0.323       5
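Note that the ddply result above comes back grouped by id rather than in the original row order. If the original order matters and you would rather avoid extra packages, a rough base R sketch with table() could look like this (the data frame here is a small made-up example, not the sampled data above):

```r
# Hypothetical example data, fixed so the result is reproducible
ct <- data.frame(id = c(2, 9, 4, 4, 7, 2),
                 item = c(0.5, -1.2, 0.3, 2.1, -0.7, 1.0))

# table() counts each distinct id; its names are the ids as character strings
counts <- table(ct$id)

# Look up each row's count by id; as.vector() drops the table names
ct$idcount <- as.vector(counts[as.character(ct$id)])

ct$idcount
```

The rows stay where they were, and each row simply picks up the count for its own id.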