
R transpose 2 matrices into a list of tibbles (for a nested df)


I have two matrices, one of latitudes and one of longitudes, each with 50 columns and (for example) 1 million rows. I need to create a list of 1 million tibbles, each with 2 columns (lon and lat) and 50 rows. My current code is:

lonlat <- list()
for (i in 1:nrow(lon)) {
  lonlat[[i]] <- tibble(lon = lon[i, ], lat = lat[i, ])
}

I'm aware that this is incredibly inefficient, but I can't get my head around how I'd do this with purrr. I feel like map2 could be the answer, but I suspect I'm not thinking about this the right way, and possibly I should reorganise the input matrices in order to make it a simpler task.

Does anyone have any experience with purrr/map2, or this kind of problem? Thanks in advance for any ideas.


Solution

  • A small reproducible example: your "50 columns" is 5 here, and your "1 million rows" is 4 here.

    lat <- matrix(1:20, nrow = 4)
    lon <- matrix(50 + 1:20, nrow = 4)
    lat
    #      [,1] [,2] [,3] [,4] [,5]
    # [1,]    1    5    9   13   17
    # [2,]    2    6   10   14   18
    # [3,]    3    7   11   15   19
    # [4,]    4    8   12   16   20
    lon
    #      [,1] [,2] [,3] [,4] [,5]
    # [1,]   51   55   59   63   67
    # [2,]   52   56   60   64   68
    # [3,]   53   57   61   65   69
    # [4,]   54   58   62   66   70
    

    Your 1-million-long list is 4 long here, and each element is a tibble with 2 columns and 5 rows.

    Map(tibble, lat=asplit(lat, 1), lon=asplit(lon, 1))
    # [[1]]
    # # A tibble: 5 x 2
    #     lat   lon
    #   <int> <dbl>
    # 1     1    51
    # 2     5    55
    # 3     9    59
    # 4    13    63
    # 5    17    67
    # [[2]]
    # # A tibble: 5 x 2
    #     lat   lon
    #   <int> <dbl>
    # 1     2    52
    # 2     6    56
    # 3    10    60
    # 4    14    64
    # 5    18    68
    # [[3]]
    # # A tibble: 5 x 2
    #     lat   lon
    #   <int> <dbl>
    # 1     3    53
    # 2     7    57
    # 3    11    61
    # 4    15    65
    # 5    19    69
    # [[4]]
    # # A tibble: 5 x 2
    #     lat   lon
    #   <int> <dbl>
    # 1     4    54
    # 2     8    58
    # 3    12    62
    # 4    16    66
    # 5    20    70
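
Why this works: asplit(x, 1) splits a matrix into a list with one element per row, and Map() then calls tibble() on the i-th lat row together with the i-th lon row. The sketch below rebuilds the same small matrices and checks the result against the loop from the question. The helper names (rows, looped, mapped, same) are made up for illustration, and the comparison goes through as.vector() on the assumption that asplit() returns 1-d arrays rather than plain vectors.

```r
library(tibble)

lat <- matrix(1:20, nrow = 4)
lon <- matrix(50 + 1:20, nrow = 4)

# asplit(x, 1) returns one list element per row of x
rows <- asplit(lat, 1)
length(rows)          # 4, one element per row
as.vector(rows[[1]])  # 1 5 9 13 17, the first row of lat

# The loop from the question, with the list preallocated
looped <- vector("list", nrow(lat))
for (i in seq_len(nrow(lat))) {
  looped[[i]] <- tibble(lat = lat[i, ], lon = lon[i, ])
}

# The Map() version
mapped <- Map(tibble, lat = asplit(lat, 1), lon = asplit(lon, 1))

# Compare values only, ignoring any dim attribute on the columns
same <- all(mapply(function(a, b) {
  all(as.vector(a$lat) == b$lat) && all(as.vector(a$lon) == b$lon)
}, mapped, looped))
same
```

If the dim attribute on the columns gets in the way later, wrapping each row in as.vector() before passing it to tibble() is one way to drop it.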
    

    If you really want to use purrr, the equivalent call is:

    purrr::map2(asplit(lat, 1), asplit(lon, 1), ~ tibble(lat=.x, lon=.y))
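
Since the title mentions a nested data frame, here is a rough sketch of storing the resulting list as a list-column. The column names id and coords are hypothetical, as is the choice to index rows with lapply(); that choice is only there so each column ends up as a plain vector.

```r
library(tibble)

lat <- matrix(1:20, nrow = 4)
lon <- matrix(50 + 1:20, nrow = 4)

# One tibble per matrix row; lon[i, ] and lat[i, ] are plain vectors
lonlat <- lapply(seq_len(nrow(lon)), function(i) {
  tibble(lon = lon[i, ], lat = lat[i, ])
})

# A list of tibbles can be used directly as a list-column.
# `id` and `coords` are illustrative names, not from the question.
nested <- tibble(id = seq_along(lonlat), coords = lonlat)

nrow(nested)            # 4, one row per original matrix row
nested$coords[[2]]$lon  # 52 56 60 64 68, the second row of lon
```

Each row of nested then holds one 5-row tibble, which is the shape that tidyr::unnest() expects if the data needs to be flattened again.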