I'm having trouble efficiently loading data into a sparse matrix format in R.
Here is an (incomplete) example of my current strategy:
library(Matrix)
a1 <- Matrix(0, 5000, 100000, sparse=TRUE)
for (i in 1:5000)
  a1[i, idxOfCols] <- x
Here x is usually a vector of length about 20. This is inefficient and eventually slows to a crawl. I know there must be a better way, but I'm not sure what it is. Suggestions?
You can populate the matrix all at once by passing the row indices, column indices, and values to sparseMatrix() in a single call:
library(Matrix)
n <- 5000
m <- 1e5
k <- 20
idxOfCols <- sample(1:m, k)
x <- rnorm(k)
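a2 <- sparseMatrix(
  i=rep(1:n, each=k),    # each row index repeated k times
  j=rep(idxOfCols, n),   # the same k columns for every row
  x=rep(x, n),           # the k values repeated once per row, aligned with j
  dims=c(n,m)
)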
# Compare with the row-by-row approach
a1 <- Matrix(0, n, m, sparse=TRUE)
for(i in 1:n) {
  a1[i,idxOfCols] <- x
}
sum(abs(a1 - a2)) # 0
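If the columns and values differ from row to row, which the question seems to imply, the same idea should still apply: collect the indices and values first, then call sparseMatrix() once. Below is a minimal sketch under that assumption. The lists `cols` and `vals` are hypothetical stand-ins for however you obtain your per-row data, and the dimensions are kept small so that it runs instantly.

```r
library(Matrix)

set.seed(1)
n <- 50     # rows (small for illustration)
m <- 1000   # columns
k <- 20     # nonzero entries per row

# Hypothetical per-row inputs: each row has its own columns and values.
cols <- lapply(seq_len(n), function(i) sample(m, k))
vals <- lapply(seq_len(n), function(i) rnorm(k))

# Flatten everything into (i, j, x) triplets and build the matrix once.
a3 <- sparseMatrix(
  i = rep(seq_len(n), lengths(cols)),  # row index repeated once per entry
  j = unlist(cols),
  x = unlist(vals),
  dims = c(n, m)
)
```

Using lengths(cols) for the repeat counts means the rows do not all need the same number of entries. Note that sparseMatrix() sums the values of duplicated (i, j) pairs, so make sure the column indices within a row are unique if that is not what you want.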