Calculating cumulative sum for each row
You can also try mySum = t(apply(df, 1, cumsum)).
The transpose is in there because the results come out transposed, for a reason I have not yet determined.
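For what it's worth, the transposition appears to follow from how apply assembles its result: per its documentation, when each call to FUN returns a vector of length n, the result is an array of dimension c(n, dim(X)[MARGIN]), so each row's cumsum ends up stored as a column. A small sketch, using a made-up matrix m:

```r
# m is a made-up 2 x 3 example matrix
m <- matrix(1:6, nrow = 2)

# Each row's cumsum (length 3) is stored as a column, so the raw result is 3 x 2
raw <- apply(m, 1, cumsum)
dim(raw)

# Transposing restores the original 2 x 3 orientation
row_cumsums <- t(raw)
row_cumsums
```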
I'm sure there are also fine solutions with plyr, such as ddply, and with multicore methods.
You want cumsum()
df <- within(df, acc_sum <- cumsum(count))
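As a minimal sketch of what that line does, assuming a data frame with a numeric column named count (the id column and the values below are made up for illustration):

```r
# Hypothetical data; only the column name `count` comes from the line above
df <- data.frame(id = 1:4, count = c(5, 2, 7, 1))

# within() evaluates the expression inside df and adds acc_sum as a new column
df <- within(df, acc_sum <- cumsum(count))
df$acc_sum
```

Each entry of acc_sum is the total of count up to and including that row.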
To replicate the OP's result, the cumsum function is all that is needed, as Chase's answer shows. However, the OP's wording "for each row" possibly indicates interest in the cumulative sums of a matrix or data frame.
For column-wise cumsums of a data.frame, interestingly, cumsum is again all one needs! cumsum is a primitive that is part of the Math group of generic functions, which is defined for data frames as applying the function to each column; inside the code, it just does this: x[] <- lapply(x, .Generic, ...).
> foo <- matrix(1:6, ncol=3)
> df <- data.frame(foo)
> df
  X1 X2 X3
1  1  3  5
2  2  4  6
> cumsum(df)
  X1 X2 X3
1  1  3  5
2  3  7 11
Interestingly, sum is not part of Math, but part of the Summary group of generic functions; for data frames, this group first converts the data frame to a matrix and then calls the generic, so sum returns not column-wise sums but the overall sum:
> sum(df)
[1] 21
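If per-column totals are what is wanted instead, colSums is one option. The sketch below rebuilds the same df so that it runs on its own, and also checks that sum(df) agrees with summing the matrix form explicitly:

```r
# Rebuild the example data frame so this snippet runs on its own
df <- data.frame(matrix(1:6, ncol = 3))

# Summary group: the whole data frame collapses to one number
total <- sum(df)

# Same value as summing the matrix version explicitly
total_from_matrix <- sum(as.matrix(df))

# colSums gives one total per column instead
per_column <- colSums(df)
per_column
```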
This discrepancy is (in my opinion) most likely because cumsum returns an object of the same size as the original, but sum would not.
For row-wise cumulative sums, there is no single function that I know of that replicates this behavior; Iterator's solution is probably one of the most straightforward.
If speed is an issue, it would almost certainly be fastest and most foolproof to write it in C; however, a simple for loop already speeds things up a little (~2x?) when there are many rows or columns to loop over.
rowCumSums <- function(x) {
  # Expects a matrix: replace each row with its running total
  for (i in seq_len(nrow(x))) x[i, ] <- cumsum(x[i, ])
  x
}
colCumSums <- function(x) {
  # Replace each column with its running total
  for (i in seq_len(ncol(x))) x[, i] <- cumsum(x[, i])
  x
}
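A quick sanity check, assuming matrix input (the test matrix is made up), compares the loop versions against the apply-based approach. The two functions are repeated so the snippet runs on its own:

```r
rowCumSums <- function(x) {
  for (i in seq_len(nrow(x))) x[i, ] <- cumsum(x[i, ])
  x
}
colCumSums <- function(x) {
  for (i in seq_len(ncol(x))) x[, i] <- cumsum(x[, i])
  x
}

# Made-up test matrix
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

# Row-wise: should agree with the transposed apply() result
row_ok <- all.equal(rowCumSums(m), t(apply(m, 1, cumsum)))

# Column-wise: apply() over columns needs no transpose
col_ok <- all.equal(colCumSums(m), apply(m, 2, cumsum))

c(row_ok, col_ok)
```

Note that rowCumSums relies on x[i, ] being a plain vector. For a data frame, x[i, ] is a one-row data frame and cumsum is applied to each column separately, so the row would come back unchanged; converting with as.matrix() first avoids that.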
This can be sped up more by using the plain cumsum and subtracting off the sum so far when you get to the end of a column. For row cumulative sums, one needs to transpose twice.
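colCumSums2 <- function(x) {
  # Append -colSums(x) as a last row so the running total resets to zero per column
  padded <- cumsum(rbind(x, -colSums(x)))
  # drop = FALSE keeps the result a matrix even for a single row or column
  matrix(padded, ncol = ncol(x))[seq_len(nrow(x)), , drop = FALSE]
}
rowCumSums2 <- function(x) {
  t(colCumSums2(t(x)))
}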
That's really a hack though. Don't do it.
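To make the trick concrete, here is a small trace on a made-up 2 x 2 matrix. Appending the negated column totals as an extra row makes the running total return to zero at the end of each column, so a single cumsum over the whole thing effectively restarts per column:

```r
# Made-up example matrix with columns (1, 2) and (3, 4)
m <- matrix(c(1, 2, 3, 4), nrow = 2)

# Append the negated column totals as an extra row
padded <- rbind(m, -colSums(m))

# cumsum walks down the columns; the extra row brings the total back to 0
running <- matrix(cumsum(padded), ncol = ncol(m))
running

# Dropping the extra row leaves the column-wise cumulative sums
result <- running[seq_len(nrow(m)), , drop = FALSE]
result
```

With non-integer values the return to zero may only be approximate, so rounding error can carry over from one column into the next, which is one reason to treat this as a hack.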