Dividing columns by colSums in R
Per usual, Joris has a great answer. Two others that came to mind:
#Essentially your answer
f1 <- function() m / rep(colSums(m), each = nrow(m))
#Two calls to transpose
f2 <- function() t(t(m) / colSums(m))
#Joris
f3 <- function() sweep(m,2,colSums(m),`/`)
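A quick sanity check that the three variants agree can be sketched as follows. The small 3x3 matrix here is an assumption for illustration only (it is not the matrix used for the timings), and the functions read `m` from the global environment as in the definitions above.

```r
# Small test matrix (assumed for illustration; not from the original post)
m <- matrix(1:9, nrow = 3, byrow = TRUE)

f1 <- function() m / rep(colSums(m), each = nrow(m))  # repeat each column sum nrow(m) times
f2 <- function() t(t(m) / colSums(m))                 # transpose, divide, transpose back
f3 <- function() sweep(m, 2, colSums(m), `/`)         # sweep the column sums out of margin 2

# All three should agree, and every column of the result should sum to 1
stopifnot(isTRUE(all.equal(f1(), f2())))
stopifnot(isTRUE(all.equal(f2(), f3())))
stopifnot(isTRUE(all.equal(colSums(f3()), rep(1, ncol(m)))))
```

If any of the `stopifnot()` calls fails, the variants are not equivalent for that input.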
Joris' answer is the fastest on my machine:
> m <- matrix(rnorm(1e7), ncol = 10000)
> library(rbenchmark)
> benchmark(f1,f2,f3, replications=1e5, order = "relative")
  test replications elapsed relative user.self sys.self user.child sys.child
3   f3       100000   0.386   1.0000     0.385    0.001          0         0
1   f1       100000   0.421   1.0907     0.382    0.002          0         0
2   f2       100000   0.465   1.2047     0.386    0.003          0         0
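One thing to watch when reproducing this: `benchmark()` times the expressions it is given, so the functions need to be called (`f1()` rather than `f1`) for the division itself to be timed; with rbenchmark that would be something like `benchmark(f1(), f2(), f3(), replications = 100, order = "relative")`. A minimal base-R sketch of the same comparison is below. The smaller matrix, the replication count and the `time_calls()` helper are all assumptions chosen so it runs quickly, and the ranking may differ between machines and matrix shapes.

```r
# Smaller matrix than in the benchmark above so the sketch runs quickly (assumed size)
set.seed(1)
m <- matrix(rnorm(1e5), ncol = 100)

f1 <- function() m / rep(colSums(m), each = nrow(m))
f2 <- function() t(t(m) / colSums(m))
f3 <- function() sweep(m, 2, colSums(m), `/`)

# Hypothetical helper: total elapsed seconds for n calls of f
time_calls <- function(f, n = 50) {
  system.time(for (i in seq_len(n)) f())[["elapsed"]]
}

timings <- c(f1 = time_calls(f1), f2 = time_calls(f2), f3 = time_calls(f3))
print(sort(timings))  # smallest elapsed time first
```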
See ?sweep, e.g.:
> sweep(m,2,colSums(m),`/`)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
Or you can transpose the matrix first, so that colSums(m) gets recycled correctly. Don't forget to transpose back afterwards, like this:
> t(t(m)/colSums(m))
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
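To see why the transpose is needed: R stores matrices column by column and recycles a shorter vector in that order, so a plain `m / colSums(m)` divides row i by the i-th column sum. A small sketch, assuming `m` is the 3x3 matrix `matrix(1:9, nrow = 3, byrow = TRUE)` (this is consistent with the output shown, but the original definition of `m` is not given):

```r
m  <- matrix(1:9, nrow = 3, byrow = TRUE)  # assumed input, see note above
cs <- colSums(m)                           # 12 15 18

wrong <- m / cs        # cs is recycled down each column: row i is divided by cs[i]
right <- t(t(m) / cs)  # after transposing, cs[j] lines up with original column j

# Only the transposed version yields columns that sum to 1
stopifnot(isTRUE(all.equal(colSums(right), c(1, 1, 1))))
stopifnot(!isTRUE(all.equal(colSums(wrong), c(1, 1, 1))))
```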
Or you can use the function prop.table(), which does basically the same:
> prop.table(m,2)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
The time differences are rather small. The sweep() function and the t() trick are the most flexible solutions; prop.table() only covers this particular case.
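As a rough illustration of that flexibility, the same two patterns work with other statistics and other operators. The matrix below and the choice of column means and column maxima are assumptions for the sake of the example:

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)  # assumed example matrix

# sweep() with a different statistic and operator: centre each column on its mean
centered <- sweep(m, 2, colMeans(m), `-`)

# sweep() over the other margin: divide each row by its row sum
row_props <- sweep(m, 1, rowSums(m), `/`)

# The t() trick with a different statistic: scale each column by its maximum
scaled <- t(t(m) / apply(m, 2, max))

# prop.table() covers the proportion case only (here with margin 1 for rows)
stopifnot(isTRUE(all.equal(row_props, prop.table(m, 1))))
stopifnot(isTRUE(all.equal(colMeans(centered), c(0, 0, 0))))
```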