Fastest way to multiply matrix columns with vector elements in R
Use some linear algebra and perform matrix multiplication, which is quite fast in R
.
eg
m %*% diag(v)
some benchmarking
m = matrix(rnorm(1200000), ncol=6)
v=c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)
library(microbenchmark)
microbenchmark(m %*% diag(v), t(t(m) * v))
## Unit: milliseconds
## expr min lq median uq max neval
## m %*% diag(v) 16.57174 16.78104 16.86427 23.13121 109.9006 100
## t(t(m) * v) 26.21470 26.59049 32.40829 35.38097 122.9351 100
If you have a larger number of columns your t(t(m) * v) solution outperforms the matrix multiplication solution by a wide margin. However, there is a faster solution, but it comes with a high cost in in memory usage. You create a matrix as large as m using rep() and multiply elementwise. Here's the comparison, modifying mnel's example:
m = matrix(rnorm(1200000), ncol=600)
v = rep(c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5), length = ncol(m))
library(microbenchmark)
microbenchmark(t(t(m) * v),
m %*% diag(v),
m * rep(v, rep.int(nrow(m),length(v))),
m * rep(v, rep(nrow(m),length(v))),
m * rep(v, each = nrow(m)))
# Unit: milliseconds
# expr min lq mean median uq max neval
# t(t(m) * v) 17.682257 18.807218 20.574513 19.239350 19.818331 62.63947 100
# m %*% diag(v) 415.573110 417.835574 421.226179 419.061019 420.601778 465.43276 100
# m * rep(v, rep.int(nrow(m), ncol(m))) 2.597411 2.794915 5.947318 3.276216 3.873842 48.95579 100
# m * rep(v, rep(nrow(m), ncol(m))) 2.601701 2.785839 3.707153 2.918994 3.855361 47.48697 100
# m * rep(v, each = nrow(m)) 21.766636 21.901935 23.791504 22.351227 23.049006 66.68491 100
As you can see, using "each" in rep() sacrifices speed for clarity. The difference between rep.int and rep seems to be neglible, both implementations swap places on repeated runs of microbenchmark(). Keep in mind, that ncol(m) == length(v).
For the sake of completeness, I added sweep
to the benchmark. Despite its somewhat misleading attribute names, I think it may be more readable than other alternatives, and also quite fast:
n = 1000
M = matrix(rnorm(2 * n * n), nrow = n)
v = rnorm(2 * n)
microbenchmark::microbenchmark(
M * rep(v, rep.int(nrow(M), length(v))),
sweep(M, MARGIN = 2, STATS = v, FUN = `*`),
t(t(M) * v),
M * rep(v, each = nrow(M)),
M %*% diag(v)
)
Unit: milliseconds
expr min lq mean
M * rep(v, rep.int(nrow(M), length(v))) 5.259957 5.535376 9.994405
sweep(M, MARGIN = 2, STATS = v, FUN = `*`) 16.083039 17.260790 22.724433
t(t(M) * v) 19.547392 20.748929 29.868819
M * rep(v, each = nrow(M)) 34.803229 37.088510 41.518962
M %*% diag(v) 1827.301864 1876.806506 2004.140725
median uq max neval
6.158703 7.606777 66.21271 100
20.479928 23.830074 85.24550 100
24.722213 29.222172 92.25538 100
39.920664 42.659752 106.70252 100
1986.152972 2096.172601 2432.88704 100
As @Arun points out, I don't know that you'll beat your solution in terms of time efficiency. In terms of code understandability, there are other options though:
One option:
> mapply("*",as.data.frame(m),v)
V1 V2 V3
[1,] 0.0 0.0 0.0
[2,] 1.5 0.0 0.0
[3,] 1.5 3.5 0.0
[4,] 1.5 3.5 4.5
And another:
sapply(1:ncol(m),function(x) m[,x] * v[x] )