I am attempting to use the Matrix package to bind two sparse matrices of different size together. The binding is on rows, using the column names for matching.
Table A:
ID | AAAA | BBBB |
------ | ------ | ------ |
XXXX | 1 | 2 |
Table B:
ID | BBBB | CCCC |
------ | ------ | ------ |
YYYY | 3 | 4 |
Binding table A and B:
ID | AAAA | BBBB | CCCC |
------ | ------ | ------ | ------ |
XXXX | 1 | 2 | |
YYYY | | 3 | 4 |
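
For reproducibility, the two example tables could be built as sparse matrices roughly as follows. This is only a minimal sketch: the object names `A` and `B` are placeholders chosen here, and storing the IDs as row names is an assumption.

```r
library(Matrix)

# Table A: one row (XXXX) with columns AAAA and BBBB
A <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(1, 2),
                  dims = c(1, 2),
                  dimnames = list("XXXX", c("AAAA", "BBBB")))

# Table B: one row (YYYY) with columns BBBB and CCCC
B <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(3, 4),
                  dims = c(1, 2),
                  dimnames = list("YYYY", c("BBBB", "CCCC")))

# A plain rbind(A, B) stacks columns by position, not by name,
# so the columns have to be aligned before binding.
```
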
The intention is to insert a large number of small matrices into a single large matrix, to enable continuous querying and update/inserts.
As far as I can tell, neither the Matrix nor the slam package has functionality to handle this.
Similar questions have been asked in the past, but it seems no solution has been found:
Post 1: in-r-when-using-named-rows-can-a-sparse-matrix-column-be-added-concatenated
Post 2: bind-together-sparse-model-matrices-by-row-names
Ideas on how to solve it will be highly appreciated.
Best regards,
Frederik
It looks like empty columns (columns of 0s) need to be added to the matrices to make them compatible for rbind,
that is, so that both matrices have the same column names in the same order. The following code does this:
library(Matrix)
# dummy data
set.seed(3344)
A = Matrix(matrix(rbinom(16, 2, 0.2), 4))
colnames(A)=letters[1:4]
B = Matrix(matrix(rbinom(9, 2, 0.2), 3))
colnames(B) = letters[3:5]
# finding what's missing
misA = colnames(B)[!colnames(B) %in% colnames(A)]
misB = colnames(A)[!colnames(A) %in% colnames(B)]
misAl = as.vector(numeric(length(misA)), "list")
names(misAl) = misA
misBl = as.vector(numeric(length(misB)), "list")
names(misBl) = misB
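# adding missing columns to initial matrices
# (each matrix is wrapped in list() so it reaches cbind as a single argument)
An = do.call(cbind, c(list(A), misAl))
# reorder the columns of B to match the column order of An
Bn = do.call(cbind, c(list(B), misBl))[, colnames(An)]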
# final bind
rbind(An, Bn)
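
The same idea can be generalized to any number of matrices. The sketch below is one possible variant, not something provided by the Matrix package: the helper name `rbind_by_colnames` is hypothetical, and it assumes every input has unique, non-NULL column names. Instead of appending zero columns one by one, it multiplies each matrix by a sparse selection matrix that moves its columns into their positions in the union of all column names.

```r
library(Matrix)

# Hypothetical helper (not part of the Matrix package):
# row-bind any number of sparse matrices, matching on column names.
rbind_by_colnames <- function(...) {
  mats <- list(...)
  # Union of all column names, in order of first appearance
  all_cols <- Reduce(union, lapply(mats, colnames))

  expand <- function(M) {
    # Selection matrix: column k of M is sent to its position in all_cols
    P <- sparseMatrix(i = seq_len(ncol(M)),
                      j = match(colnames(M), all_cols),
                      x = 1,
                      dims = c(ncol(M), length(all_cols)))
    out <- M %*% P            # columns absent from M come out as zeros
    colnames(out) <- all_cols
    out
  }

  do.call(rbind, lapply(mats, expand))
}

# The two tables from the question
A <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(1, 2), dims = c(1, 2),
                  dimnames = list("XXXX", c("AAAA", "BBBB")))
B <- sparseMatrix(i = c(1, 1), j = c(1, 2), x = c(3, 4), dims = c(1, 2),
                  dimnames = list("YYYY", c("BBBB", "CCCC")))

res <- rbind_by_colnames(A, B)
res
```

When many small matrices have to be combined, passing them all in one call is likely preferable to binding them one at a time, since each separate rbind copies the growing result.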