I have multiple matrices reflecting bipartite / affiliation networks at different time points. These matrices have a lot of overlap in their incumbents, but also a lot of differences. For further analysis, however, I need them to be the same dimensions and have the same actors per row/column, so I need to combine row and column names somehow.
The final matrices will be around 8000 times 200, but each individual matrix is around 2000 times 150. Here is an example of two matrices and what I want the result to look like:
adj1 <- matrix(0, 3, 5)
colnames(adj1) <- c("g1", "g2", "g3", "g5", "g6")
rownames(adj1) <- c("Tim", "John", "Sarah")
adj2 <- matrix(0, 4, 2)
colnames(adj2) <- c("g1", "g4")
rownames(adj2) <- c("Tim", "Mary", "John", "Paolo")
combined_adj <- matrix(0,5,6)
colnames(combined_adj) <- c("g1","g2","g3","g4","g5","g6")
rownames(combined_adj) <- c("John","Mary","Paolo","Sarah","Tim")
Ideally, the new cells should read "NA" or "10", and rows and columns should be ordered alphabetically. The initial values in each matrix need to be kept. I am at a loss as to what to do here and appreciate any help!
You can use merge and specify that you want to use row.names for merging as well.
combined_adj <- merge(x = adj1,
                      y = adj2,
                      by = c('row.names',
                             intersect(colnames(adj1),
                                       colnames(adj2))
                      ),
                      all = TRUE
)
combined_adj
  Row.names g1 g2 g3 g5 g6 g4
1      John  0  0  0  0  0  0
2      Mary  0 NA NA NA NA  0
3     Paolo  0 NA NA NA NA  0
4     Sarah  0  0  0  0  0 NA
5       Tim  0  0  0  0  0  0
This turns it into a data.frame, so you will need to convert it back to a matrix if required.
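row.names(combined_adj) <- combined_adj[, 1]    # restore the actor names as row names
combined_adj <- as.matrix(combined_adj[, -1])   # drop the name column and convert back to a matrix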
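The question also asks for alphabetically ordered rows and columns. A minimal sketch of that clean-up for the two-matrix case, assuming the merge shown above and that every column other than the name column is numeric; the names `merged` and `result` are only for illustration:

```r
# Sample data from the question
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))

# Merge on row names plus the shared columns, keeping unmatched rows
merged <- merge(adj1, adj2,
                by = c("row.names", intersect(colnames(adj1), colnames(adj2))),
                all = TRUE)

# Move the names back into row names and convert to a numeric matrix
result <- as.matrix(merged[, -1])
rownames(result) <- as.character(merged[, 1])

# Order rows and columns alphabetically
result <- result[order(rownames(result)), order(colnames(result))]

# Optional (assumption: NA only marks cells that did not exist before):
# result[is.na(result)] <- 0
```

Leaving the last line commented out keeps the new cells as NA, so they stay distinguishable from genuine zeros.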
To combine more than two matrices, we use Reduce to apply merge successively over a list of matrices. We first convert each matrix to a data.frame and store its row names in a row_names column to simplify things.
# create sample data
adj1 <- matrix(
  0, 3, 5,
  dimnames = list(c("Tim", "John", "Sarah"),
                  c("g1", "g2", "g3", "g5", "g6"))
)
adj2 <- matrix(
  0, 4, 2,
  dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                  c("g1", "g4"))
)
adj3 <- matrix(
  0, 3, 3,
  dimnames = list(c("Tim2", "Mary2", "John"), c("g1", "g4", "g7"))
)
# create a list
list_matrices <- list(adj1, adj2, adj3)
# convert to data.frames and create a column with row.names
list_matrices <- lapply(list_matrices, function(mat) {
  mat <- as.data.frame(mat)
  mat$row_names <- row.names(mat)
  mat
})
# successively combine them: merge 1 and 2, then merge the result with 3, and so on
res <- Reduce(function(mat1, mat2) merge(mat1, mat2, all = TRUE), x = list_matrices)
res
  g1 row_names g4 g2 g3 g5 g6 g7
1  0      John  0  0  0  0  0  0
2  0      Mary  0 NA NA NA NA NA
3  0     Mary2  0 NA NA NA NA  0
4  0     Paolo  0 NA NA NA NA NA
5  0     Sarah NA  0  0  0  0 NA
6  0       Tim  0  0  0  0  0 NA
7  0      Tim2  0 NA NA NA NA  0
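If each time point should remain a separate matrix with identical dimensions, as the question describes, an alternative sketch is to pad every matrix to the union of all row and column names instead of merging. This is a different technique from the merge approach above, and `pad_matrix` is a hypothetical helper written for illustration. One reason to consider it: merge also matches on the shared value columns, so an actor whose value in a shared column differs between time points would end up in two separate rows.

```r
# Hypothetical helper: expand one matrix to the given row and column names,
# filling cells that did not exist before with `fill`
pad_matrix <- function(mat, all_rows, all_cols, fill = NA) {
  out <- matrix(fill, nrow = length(all_rows), ncol = length(all_cols),
                dimnames = list(all_rows, all_cols))
  out[rownames(mat), colnames(mat)] <- mat  # copy the original values by name
  out
}

# Sample data from the question, with one non-zero tie to show values are kept
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj1["Tim", "g2"] <- 1
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))
list_matrices <- list(adj1, adj2)

# Sorted union of all actor and group names across time points
all_rows <- sort(unique(unlist(lapply(list_matrices, rownames))))
all_cols <- sort(unique(unlist(lapply(list_matrices, colnames))))

# One padded matrix per time point, all with the same dimnames
padded <- lapply(list_matrices, pad_matrix,
                 all_rows = all_rows, all_cols = all_cols)
```

Passing `fill = 0` instead of the default would give zeros in the new cells. Because the values are copied by name, this assumes row and column names are unique within each matrix.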