r, list, data.table, copy, assign

How to make a list of multiple copies of the same data.table without looping? (and not assigning by reference)


I was wondering if there is a cleaner one-liner to build a list of independent copies of the same data.table object (true copies, not references to the original), avoiding the need for a loop.

The use case is that I want to create m data.tables that differ only in the values of a single column. I would therefore build a list of m identical copies, then iterate over it and modify that one column in each copy, ending up with m different data.tables.

Currently my clunky inelegant approach is:

library(data.table)

# Load data and create data.table
data("airquality")
dt <- data.table(airquality)

# Number of copies
m <- 3

# Pre-allocate a list with m empty slots
dt_list <- vector("list", m)

# Populate the list with `m` copies of the original data.table
for (i in seq_len(m)) {
  dt_list[[i]] <- copy(dt)
}

# Assign by reference within each data.table (i is recycled to every row)
for (i in seq_len(m)) {
  dt_list[[i]][, Ozone := i]
}

dt_list
# [[1]]
# Ozone Solar.R Wind Temp Month Day
# 1:     1     190  7.4   67     5   1
# 2:     1     118  8.0   72     5   2
# 3:     1     149 12.6   74     5   3
# 4:     1     313 11.5   62     5   4
# 5:     1      NA 14.3   56     5   5

# [[2]]
# Ozone Solar.R Wind Temp Month Day
# 1:     2     190  7.4   67     5   1
# 2:     2     118  8.0   72     5   2
# 3:     2     149 12.6   74     5   3
# 4:     2     313 11.5   62     5   4
# 5:     2      NA 14.3   56     5   5

# [[3]]
# Ozone Solar.R Wind Temp Month Day
# 1:     3     190  7.4   67     5   1
# 2:     3     118  8.0   72     5   2
# 3:     3     149 12.6   74     5   3
# 4:     3     313 11.5   62     5   4
# 5:     3      NA 14.3   56     5   5

Similar questions either deal with different data.table objects or end up with list elements that share one reference, so changing one copy changes all of them. For example, after dt_list <- rep(list(dt), m), every Ozone column ends up filled with 3s.
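The shared-reference behaviour described above can be reproduced with a minimal sketch. It assumes the same `airquality` data and `m = 3` as in the question; the names `shared` and `ozone_values` are only illustrative.

```r
library(data.table)

dt <- data.table(airquality)
m <- 3

# rep() repeats the list element without deep-copying the data.table,
# so every slot is expected to refer to the same underlying object.
shared <- rep(list(dt), m)

# Each := therefore overwrites the Ozone column seen through all slots.
for (i in seq_len(m)) {
  shared[[i]][, Ozone := i]
}

# Distinct Ozone values per element: each element holds a single value,
# the last one written by the loop.
ozone_values <- lapply(shared, function(x) unique(x$Ozone))
print(ozone_values)
```

Because no `copy()` is involved, the original `dt` should be affected in the same way, which is why an explicit copy per element is needed.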


Solution

  • One way to solve your problem:

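    # copy() makes a deep copy, so := only modifies that copy
    # (the \(i) lambda shorthand needs R >= 4.1)
    dt_list = lapply(1:3, \(i) copy(dt)[, Ozone := i])
    # or, equivalently, with set()
    dt_list = lapply(1:3, \(i) set(copy(dt), j="Ozone", value=i))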
    
    [[1]]
         Ozone Solar.R  Wind  Temp Month   Day
         <int>   <int> <num> <int> <int> <int>
      1:     1     190   7.4    67     5     1
      2:     1     118   8.0    72     5     2
      3:     1     149  12.6    74     5     3
      4:     1     313  11.5    62     5     4
      5:     1      NA  14.3    56     5     5
     ---                                      
    149:     1     193   6.9    70     9    26
    150:     1     145  13.2    77     9    27
    151:     1     191  14.3    75     9    28
    152:     1     131   8.0    76     9    29
    153:     1     223  11.5    68     9    30
    
    [[2]]
         Ozone Solar.R  Wind  Temp Month   Day
         <int>   <int> <num> <int> <int> <int>
      1:     2     190   7.4    67     5     1
      2:     2     118   8.0    72     5     2
      3:     2     149  12.6    74     5     3
      4:     2     313  11.5    62     5     4
      5:     2      NA  14.3    56     5     5
     ---                                      
    149:     2     193   6.9    70     9    26
    150:     2     145  13.2    77     9    27
    151:     2     191  14.3    75     9    28
    152:     2     131   8.0    76     9    29
    153:     2     223  11.5    68     9    30
    
    [[3]]
         Ozone Solar.R  Wind  Temp Month   Day
         <int>   <int> <num> <int> <int> <int>
      1:     3     190   7.4    67     5     1
      2:     3     118   8.0    72     5     2
      3:     3     149  12.6    74     5     3
      4:     3     313  11.5    62     5     4
      5:     3      NA  14.3    56     5     5
     ---                                      
    149:     3     193   6.9    70     9    26
    150:     3     145  13.2    77     9    27
    151:     3     191  14.3    75     9    28
    152:     3     131   8.0    76     9    29
    153:     3     223  11.5    68     9    30
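As a rough check that the `lapply()` approach above really yields independent copies, the sketch below rebuilds the list with `m` taken from the question and compares memory addresses using data.table's `address()`. The variable names `addresses` and `original_untouched` are illustrative only, and `function(i)` is used instead of `\(i)` so that it also runs on older R versions.

```r
library(data.table)

dt <- data.table(airquality)
m <- 3

# One deep copy per element; each copy gets its own Ozone value.
dt_list <- lapply(seq_len(m), function(i) copy(dt)[, Ozone := i])

# Every element should live at a different memory address.
addresses <- sapply(dt_list, address)
print(length(unique(addresses)) == m)

# The original data.table should still hold the airquality Ozone values.
original_untouched <- identical(dt$Ozone, airquality$Ozone)
print(original_untouched)
```

If either check printed `FALSE`, some elements would still be sharing data with each other or with `dt`.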