Search code examples
rmatrixo-d-matrix

Is there a possibility to merge several origin destination matrices into one dataframe?


I have created origin destination matrices for different weeks in the year, e.g. the output looks like:

Region 1 Region 2 Region 3
Region 1 0 8 1
Region 2 4 3 3
Region 3 2 2 3

Week 1

I have similar looking matrices for all weeks of the year, all representing activity between each pair of nodes. Now, I want to compute a dataframe which shows activity for all different pairs of origin-destination (13x13) per week in the year. How can I code this using R?


Solution

  • Obviously we don't have your data. I'll create a little example data set here so you can see one approach that should work for you.

    Suppose I have three matrices representing three weeks:

    mat1
    #>          Region 1 Region 2 Region 3
    #> Region 1        0        8        1
    #> Region 2        4        3        3
    #> Region 3        2        2        3
    
    mat2
    #>          Region 1 Region 2 Region 3
    #> Region 1        9        6        2
    #> Region 2        3        4        7
    #> Region 3        5        8        1
    
    mat3
    #>          Region 1 Region 2 Region 3
    #> Region 1        6        8        5
    #> Region 2        9        3        1
    #> Region 3        7        4        2
    

    (The code to recreate these matrices is shown at the bottom of this answer in a format you can copy and paste to your R console).

    The first thing to do is to get all your matrices into a list (if they are not already)

    my_list <- list(mat1, mat2, mat3)
    

    Now you can melt the matrices into data frames. Rather than doing this one at a times, we can do them all at once now that they are in a list by calling lapply:

    library(reshape2)
    
    my_dfs  <- lapply(my_list, melt)
    

    This will give us a list of data frames, one for each week. Now we need to bind these together into a single long data frame.

    df      <- do.call(rbind, my_dfs)
    

    Lastly, we want to add an extra column to the data frame so that we know which week the data comes from:

    df$week <- rep(seq(length(my_list)), each = length(mat1))
    

    And this gives us the final result:

    df
    #>        Var1     Var2 value week
    #> 1  Region 1 Region 1     0    1
    #> 2  Region 2 Region 1     4    1
    #> 3  Region 3 Region 1     2    1
    #> 4  Region 1 Region 2     8    1
    #> 5  Region 2 Region 2     3    1
    #> 6  Region 3 Region 2     2    1
    #> 7  Region 1 Region 3     1    1
    #> 8  Region 2 Region 3     3    1
    #> 9  Region 3 Region 3     3    1
    #> 10 Region 1 Region 1     9    2
    #> 11 Region 2 Region 1     3    2
    #> 12 Region 3 Region 1     5    2
    #> 13 Region 1 Region 2     6    2
    #> 14 Region 2 Region 2     4    2
    #> 15 Region 3 Region 2     8    2
    #> 16 Region 1 Region 3     2    2
    #> 17 Region 2 Region 3     7    2
    #> 18 Region 3 Region 3     1    2
    #> 19 Region 1 Region 1     6    3
    #> 20 Region 2 Region 1     9    3
    #> 21 Region 3 Region 1     7    3
    #> 22 Region 1 Region 2     8    3
    #> 23 Region 2 Region 2     3    3
    #> 24 Region 3 Region 2     4    3
    #> 25 Region 1 Region 3     5    3
    #> 26 Region 2 Region 3     1    3
    #> 27 Region 3 Region 3     2    3
    

    Created on 2022-03-11 by the reprex package (v2.0.1)


    Data

    mat1 <- structure(c(0L, 4L, 2L, 8L, 3L, 2L, 1L, 3L, 3L), .Dim = c(3L, 
    3L), .Dimnames = list(c("Region 1", "Region 2", "Region 3"), 
        c("Region 1", "Region 2", "Region 3")))
    
    mat2 <- structure(c(9L, 3L, 5L, 6L, 4L, 8L, 2L, 7L, 1L), .Dim = c(3L, 
    3L), .Dimnames = list(c("Region 1", "Region 2", "Region 3"), 
        c("Region 1", "Region 2", "Region 3")))
    
    mat3 <- structure(c(6L, 9L, 7L, 8L, 3L, 4L, 5L, 1L, 2L), .Dim = c(3L, 
    3L), .Dimnames = list(c("Region 1", "Region 2", "Region 3"), 
        c("Region 1", "Region 2", "Region 3")))