R convert matrix or data frame to sparseMatrix

For a matrix, the other answers already cover it.

For a data.table, the mltools package does the job.

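library(data.table)  # provides data.table(); mltools does not attach it
library(Matrix)
library(mltools)

x <- data.table()    # placeholder: put your own data.table here
sparseM <- sparsify(x)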
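
If you'd rather not depend on mltools, an all-numeric data frame can also be routed through a dense matrix first. This is only a sketch: the data frame `df` and its column names are made up for illustration, and it creates a dense copy along the way, so it only suits data that already fits in memory.

```r
library(Matrix)

# Hypothetical all-numeric data frame (names and values are made up)
df <- data.frame(
  a = c(0, 0, 3, 0),
  b = c(1, 0, 0, 0),
  c = c(0, 0, 0, 2)
)

# data.matrix() turns the data frame into a numeric matrix;
# Matrix(..., sparse = TRUE) then stores it in compressed sparse form
sparseDF <- Matrix(data.matrix(df), sparse = TRUE)

sparseDF
```

Column names carry over as dimnames, so `sparseDF[3, "a"]` still works after the conversion.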

Josh's answer is fine, but here are more options and explanation.

Nitpick: "I have a regular matrix (non-sparse)..." You actually do have a sparse matrix (a matrix with mostly 0s); it is just stored in an uncompressed (dense) format. Your goal is to put it in a compressed storage format.

Sparse matrices can be compressed into multiple storage formats. Compressed Sparse Column (CSC) and Compressed Sparse Row (CSR) are the two dominant formats. as(regMat, "sparseMatrix") converts your matrix to type dgCMatrix which is compressed sparse column. This is usually what you want, but I prefer to be explicit about it.

library(Matrix)

matCSC <- as(regMat, "dgCMatrix")  # compressed sparse column CSC
matCSC
10 x 10 sparse Matrix of class "dgCMatrix"

 [1,] . . .  .  . 57 .  . . .
 [2,] . . .  .  .  . . 27 . .
 [3,] . . .  . 90  . .  . . .
 [4,] . . .  .  .  . .  . . .
 [5,] . . .  .  .  . .  . . .
 [6,] . . .  .  .  . .  . . .
 [7,] . . . 91  .  . .  . . .
 [8,] . . . 37  .  . .  . . .
 [9,] . . .  .  .  . .  . . .
[10,] . . .  .  .  . .  . . .

matCSR <- as(regMat, "dgRMatrix")  # compressed sparse row CSR
matCSR
10 x 10 sparse Matrix of class "dgRMatrix"

 [1,] . . .  .  . 57 .  . . .
 [2,] . . .  .  .  . . 27 . .
 [3,] . . .  . 90  . .  . . .
 [4,] . . .  .  .  . .  . . .
 [5,] . . .  .  .  . .  . . .
 [6,] . . .  .  .  . .  . . .
 [7,] . . . 91  .  . .  . . .
 [8,] . . . 37  .  . .  . . .
 [9,] . . .  .  .  . .  . . .
[10,] . . .  .  .  . .  . . .

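While these look and behave the same on the surface, internally they store data differently. CSC is faster for retrieving columns of data, while CSR is faster for retrieving rows. They can also take up different amounts of space: both store one value and one index per non-zero entry, but CSC keeps a pointer vector of length ncol + 1 and CSR one of length nrow + 1, so the difference shows up when the matrix is far from square.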
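
To see the difference in storage, you can look at the slots directly. The sketch below uses a small made-up matrix `m` (not the one from the question) and coerces via the virtual classes "CsparseMatrix" and "RsparseMatrix", which is my choice here and assumes a reasonably recent Matrix version.

```r
library(Matrix)

# Hypothetical 3 x 4 matrix with three non-zero entries
m <- matrix(0, nrow = 3, ncol = 4)
m[1, 2] <- 5
m[3, 2] <- 7
m[2, 4] <- 9

csc <- as(m, "CsparseMatrix")  # compressed sparse column
csr <- as(m, "RsparseMatrix")  # compressed sparse row

# CSC: values in column-major order, 0-based row indices, column pointers
csc@x  # 5 7 9
csc@i  # 0 2 1
csc@p  # 0 0 2 2 3  (length ncol + 1)

# CSR: values in row-major order, 0-based column indices, row pointers
csr@x  # 5 9 7
csr@j  # 1 3 1
csr@p  # 0 1 2 3    (length nrow + 1)
```

Pulling out a column of `csc` only needs the slice of `x` and `i` between two neighbouring entries of `p`; the same holds for rows of `csr`. That is where the speed difference comes from.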

Furthermore, in this example you're converting an uncompressed sparse matrix to a compressed one. Usually you do this to save memory, so building an uncompressed matrix just to convert it to compressed form defeats the purpose. In practice it's more common to construct a compressed sparse matrix from a table of (row, column, value) triplets. You can do this with Matrix's sparseMatrix() function.

# Make data.frame of (row, column, value) triplets
df <- data.frame(
  rowIdx = c(3,2,8,1,7),
  colIdx = c(5,8,4,6,4),
  val = round(runif(n = 5), 2) * 100
)

df
  rowIdx colIdx val
1      3      5  90
2      2      8  27
3      8      4  37
4      1      6  57
5      7      4  91

# Build CSC matrix
matSparse <- sparseMatrix(
  i = df$rowIdx,
  j = df$colIdx, 
  x = df$val, 
  dims = c(10, 10)
)

matSparse
10 x 10 sparse Matrix of class "dgCMatrix"

 [1,] . . .  .  . 57 .  . . .
 [2,] . . .  .  .  . . 27 . .
 [3,] . . .  . 90  . .  . . .
 [4,] . . .  .  .  . .  . . .
 [5,] . . .  .  .  . .  . . .
 [6,] . . .  .  .  . .  . . .
 [7,] . . . 91  .  . .  . . .
 [8,] . . . 37  .  . .  . . .
 [9,] . . .  .  .  . .  . . .
[10,] . . .  .  .  . .  . . .
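
To make the memory argument concrete, here is a rough comparison of the dense route and the triplet route. The size `n = 1000` is an arbitrary assumption chosen to make the gap visible, and the values are fixed to the ones printed above instead of being drawn with runif().

```r
library(Matrix)

# Triplets from the example above, with the values fixed for reproducibility
rowIdx <- c(3, 2, 8, 1, 7)
colIdx <- c(5, 8, 4, 6, 4)
val    <- c(90, 27, 37, 57, 91)

n <- 1000  # hypothetical size, chosen only to make the difference visible

# Dense route: allocates all n * n cells up front (about 8 MB of doubles)
dense <- matrix(0, nrow = n, ncol = n)
dense[cbind(rowIdx, colIdx)] <- val

# Sparse route: stores only the five non-zero entries plus index vectors
sparse <- sparseMatrix(i = rowIdx, j = colIdx, x = val, dims = c(n, n))

print(object.size(dense), units = "Kb")
print(object.size(sparse), units = "Kb")
```

Both objects hold the same entries, but the sparse one never allocates the zeros, which is the point of building from triplets in the first place.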

Shameless plug: I have a blog article covering this and more, if you're interested.


Here are two options:

library(Matrix)

A <- as(regMat, "sparseMatrix")       # see also `vignette("Intro2Matrix")`
B <- Matrix(regMat, sparse = TRUE)    # Thanks to Aaron for pointing this out

identical(A, B)
# [1] TRUE
A
# 10 x 10 sparse Matrix of class "dgCMatrix"
#                              
#  [1,] . . .  .  . 45 .  . . .
#  [2,] . . .  .  .  . . 59 . .
#  [3,] . . .  . 95  . .  . . .
#  [4,] . . .  .  .  . .  . . .
#  [5,] . . .  .  .  . .  . . .
#  [6,] . . .  .  .  . .  . . .
#  [7,] . . . 23  .  . .  . . .
#  [8,] . . . 63  .  . .  . . .
#  [9,] . . .  .  .  . .  . . .
# [10,] . . .  .  .  . .  . . .
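
regMat comes from the question and isn't defined in this answer. For a copy-paste test, a stand-in can be built as below; the values are simply read off the printed output above, so treat it as an assumed reconstruction and not the asker's actual data.

```r
library(Matrix)

# Assumed stand-in for regMat: a 10 x 10 dense matrix of zeros with the
# five non-zero entries shown in the printed output above
regMat <- matrix(0, nrow = 10, ncol = 10)
regMat[cbind(c(1, 2, 3, 7, 8), c(6, 8, 5, 4, 4))] <- c(45, 59, 95, 23, 63)

A <- as(regMat, "sparseMatrix")
B <- Matrix(regMat, sparse = TRUE)

class(A)                     # sparse class chosen by as()
class(B)                     # sparse class chosen by Matrix()
all(as.matrix(A) == regMat)  # round trip back to dense keeps every entry
```

The as.matrix() round trip is a cheap sanity check that nothing was lost in the conversion; avoid it on matrices that are too large to hold densely.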