Dividing columns by colSums in R

Per usual, Joris has a great answer. Two others that came to mind:

#Essentially your answer
f1 <- function() m / rep(colSums(m), each = nrow(m))
#Two calls to transpose
f2 <- function() t(t(m) / colSums(m))
#Joris
f3 <- function() sweep(m,2,colSums(m),`/`)
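A quick sanity check that the three variants agree may be useful before timing them. This is only a sketch: the small matrix `m` below is a hypothetical example chosen for illustration, not the matrix used in the benchmark.

```r
# Small test matrix (hypothetical values, for illustration only)
m <- matrix(1:9, nrow = 3, byrow = TRUE)

f1 <- function() m / rep(colSums(m), each = nrow(m))
f2 <- function() t(t(m) / colSums(m))
f3 <- function() sweep(m, 2, colSums(m), `/`)

# All three should return the same matrix
same12 <- isTRUE(all.equal(f1(), f2()))
same13 <- isTRUE(all.equal(f1(), f3()))

# After dividing by the column sums, every column should total 1
col_totals <- colSums(f3())
```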

Joris' answer is the fastest on my machine:

> m <- matrix(rnorm(1e7), ncol = 10000)
> library(rbenchmark)
> benchmark(f1,f2,f3, replications=1e5, order = "relative")
  test replications elapsed relative user.self sys.self user.child sys.child
3   f3       100000   0.386   1.0000     0.385    0.001          0         0
1   f1       100000   0.421   1.0907     0.382    0.002          0         0
2   f2       100000   0.465   1.2047     0.386    0.003          0         0
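One caveat: benchmark() evaluates the expressions it is given, and `f1`, `f2`, `f3` without parentheses evaluate to the function objects rather than calling them, so the timings above probably do not measure the division itself. A minimal sketch that actually calls each function, using base R's system.time() and a smaller matrix so it runs quickly (the helper `time_calls`, the matrix size, and the repetition count are all assumptions made for illustration):

```r
# Smaller matrix than in the answer so the sketch runs quickly (assumption)
set.seed(1)
m <- matrix(rnorm(1e5), ncol = 100)

f1 <- function() m / rep(colSums(m), each = nrow(m))
f2 <- function() t(t(m) / colSums(m))
f3 <- function() sweep(m, 2, colSums(m), `/`)

# Hypothetical helper: time n_reps real calls of f, return elapsed seconds
time_calls <- function(f, n_reps = 50) {
  unname(system.time(for (i in seq_len(n_reps)) f())["elapsed"])
}

timings <- c(f1 = time_calls(f1), f2 = time_calls(f2), f3 = time_calls(f3))
print(sort(timings))

# With rbenchmark, the equivalent would call the functions explicitly:
# benchmark(f1(), f2(), f3(), replications = 50, order = "relative")
```

The ranking may well differ from the table above and will depend on matrix shape and machine, so it is worth rerunning on your own data.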

See ?sweep, e.g.:

> sweep(m,2,colSums(m),`/`)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000

Or you can transpose the matrix so that colSums(m) gets recycled correctly. Don't forget to transpose back afterwards, like this:

> t(t(m)/colSums(m))
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
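To see why the transpose is needed, it helps to compare against the naive division. R stores matrices column by column and recycles the shorter vector in that order, so `m / colSums(m)` pairs the i-th sum with the i-th row, not the i-th column. A small sketch, assuming `m` is the 3x3 matrix implied by the printed output (it is not shown in the answer itself):

```r
# Matrix inferred from the printed output above (an assumption)
m <- matrix(1:9, nrow = 3, byrow = TRUE)

# Naive division: colSums(m) is recycled down each column,
# so element (i, j) is divided by the sum of column i. Not what we want.
naive <- m / colSums(m)

# Transposing first turns the columns of m into rows, so the recycled
# sums line up with them; the outer t() restores the original shape.
correct <- t(t(m) / colSums(m))

colSums(naive)    # columns do not total 1
colSums(correct)  # 1 1 1
```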

Or you can use the function prop.table() to do basically the same thing:

> prop.table(m,2)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000

The time differences are rather small. The sweep() function and the t() trick are the most flexible solutions; prop.table() only covers this particular case of dividing by the marginal sums.
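As a rough illustration of that flexibility, sweep() takes any margin and any binary function, whereas prop.table() always divides by the marginal sums. The matrix below is again the one implied by the printed output, which is an assumption:

```r
# Matrix inferred from the printed output above (an assumption)
m <- matrix(1:9, nrow = 3, byrow = TRUE)

# Row-wise version: MARGIN = 1 divides each row by its row sum
row_props <- sweep(m, 1, rowSums(m), `/`)

# A different operation entirely: centre each column on its mean
centred <- sweep(m, 2, colMeans(m), `-`)

# prop.table() handles proportions for either margin, but nothing else
row_props_pt <- prop.table(m, 1)
```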
