Populating a sparse matrix in R


I'm populating a sparse matrix in R and have written the update in a for loop but was hoping to get some pointers to make it quicker. Here is some example code:

library(Matrix)

rowId <- rep(c(101:105), 2)
colId <- rep(c("A", "B"), 5)
count <- 1:10

data <- data.frame(as.character(rowId), colId, count)
names(data) <- c("rowId", "colId", "count")

# Start from an all-zero sparse matrix, with the row IDs as character names
sparse <- Matrix(0, nrow = 5, ncol = 2, sparse = TRUE,
                 dimnames = list(as.character(unique(rowId)), unique(colId)))

for (i in 1:nrow(data)) {
  sparse[data$rowId[i], data$colId[i]] <- data$count[i]
}

Is there a better way to update the sparse matrix? In my real-world problem, data has ~1 million observations and sparse is 25000 x 38242; running the loop sequentially takes a few hours.

Thanks

Stuart


Solution

  • The approach in the linked answer populates the sparse matrix in a single call, which requires two vectors of numeric row and column indices. I added these index columns to the data frame and it worked:

    library(Matrix)
    
    rowId <- rep(c(101:105), 2)
    colId <- rep(c("A", "B"), 5)
    count <- 1:10
    
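    # Convert the IDs to factors; their integer codes become the matrix indices
    rowFactor <- as.factor(rowId)
    colFactor <- as.factor(colId)
    rowIndex <- as.integer(rowFactor)
    colIndex <- as.integer(colFactor)
    
    data <- data.frame(rowIndex, rowId, colIndex, colId, count)
    
    # Label with the factor levels so the names follow the same order as the indices
    sparse <- sparseMatrix(i = data$rowIndex, j = data$colIndex, x = data$count,
                           dimnames = list(levels(rowFactor), levels(colFactor)))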
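
To check that the vectorised call gives the same result as the original loop, something like the sketch below could be used. It rebuilds the toy data, constructs the matrix with sparseMatrix(), fills an ordinary dense matrix element by element, and compares the two. The names rowFactor, colFactor, fast, slow and same are made up for this illustration, and the dense reference is only practical at toy size.

```r
library(Matrix)

# Same toy data as in the question
rowId <- rep(101:105, 2)
colId <- rep(c("A", "B"), 5)
count <- 1:10

# Factor codes act as 1-based indices; factor levels act as labels
rowFactor <- factor(rowId)
colFactor <- factor(colId)

# Vectorised construction in a single call
fast <- sparseMatrix(i = as.integer(rowFactor),
                     j = as.integer(colFactor),
                     x = as.numeric(count),
                     dims = c(nlevels(rowFactor), nlevels(colFactor)),
                     dimnames = list(levels(rowFactor), levels(colFactor)))

# Reference: fill a dense matrix one element at a time, as the loop did
slow <- matrix(0, nrow = nlevels(rowFactor), ncol = nlevels(colFactor),
               dimnames = list(levels(rowFactor), levels(colFactor)))
for (k in seq_along(count)) {
  slow[as.character(rowId[k]), colId[k]] <- count[k]
}

# TRUE when both constructions agree on every cell
same <- all(as.matrix(fast) == slow)
```

One difference to keep in mind: if the same (row, column) pair appears more than once in the data, sparseMatrix() sums the values by default, whereas the loop keeps only the last value written. The toy data has no repeated pairs, so the two agree here.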