
Assign multiple columns using := in data.table, by group


What is the best way to assign to multiple columns using data.table? For example:

f <- function(x) {c("hi", "hello")}
x <- data.table(id = 1:10)

I would like to do something like this (of course this syntax is incorrect):

x[ , (col1, col2) := f(), by = "id"]

And to extend that, I may have many columns with names stored in a variable (say col_names) and I would like to do:

x[ , col_names := another_f(), by = "id", with = FALSE]

What is the correct way to do something like this?


Solution

  • This now works in v1.8.3 on R-Forge. Thanks for highlighting it!

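    library(data.table)

    x <- data.table(a = 1:3, b = 1:6)  # a is recycled to 1, 2, 3, 1, 2, 3
    # f returns a list: one element per new column
    f <- function(x) {list("hi", "hello")}
    x[ , c("col1", "col2") := f(), by = a][]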
    #    a b col1  col2
    # 1: 1 1   hi hello
    # 2: 2 2   hi hello
    # 3: 3 3   hi hello
    # 4: 1 4   hi hello
    # 5: 2 5   hi hello
    # 6: 3 6   hi hello
    
    x[ , c("mean", "sum") := list(mean(b), sum(b)), by = a][]
    #    a b col1  col2 mean sum
    # 1: 1 1   hi hello  2.5   5
    # 2: 2 2   hi hello  3.5   7
    # 3: 3 3   hi hello  4.5   9
    # 4: 1 4   hi hello  2.5   5
    # 5: 2 5   hi hello  3.5   7
    # 6: 3 6   hi hello  4.5   9 
    
    mynames = c("Name1", "Longer%")
    x[ , (mynames) := list(mean(b) * 4, sum(b) * 3), by = a][]
    #    a b col1  col2 mean sum Name1 Longer%
    # 1: 1 1   hi hello  2.5   5    10      15
    # 2: 2 2   hi hello  3.5   7    14      21
    # 3: 3 3   hi hello  4.5   9    18      27
    # 4: 1 4   hi hello  2.5   5    10      15
    # 5: 2 5   hi hello  3.5   7    14      21
    # 6: 3 6   hi hello  4.5   9    18      27
    


    x[ , get("mynames") := list(mean(b) * 4, sum(b) * 3), by = a][]  # same
    #    a b col1  col2 mean sum Name1 Longer%
    # 1: 1 1   hi hello  2.5   5    10      15
    # 2: 2 2   hi hello  3.5   7    14      21
    # 3: 3 3   hi hello  4.5   9    18      27
    # 4: 1 4   hi hello  2.5   5    10      15
    # 5: 2 5   hi hello  3.5   7    14      21
    # 6: 3 6   hi hello  4.5   9    18      27
    
    x[ , eval(mynames) := list(mean(b) * 4, sum(b) * 3), by = a][]   # same
    #    a b col1  col2 mean sum Name1 Longer%
    # 1: 1 1   hi hello  2.5   5    10      15
    # 2: 2 2   hi hello  3.5   7    14      21
    # 3: 3 3   hi hello  4.5   9    18      27
    # 4: 1 4   hi hello  2.5   5    10      15
    # 5: 2 5   hi hello  3.5   7    14      21
    # 6: 3 6   hi hello  4.5   9    18      27
    

    Older syntax using the with argument (we discourage this argument where possible; prefer wrapping the name variable in parentheses, (mynames) :=, as shown above):

    x[ , mynames := list(mean(b) * 4, sum(b) * 3), by = a, with = FALSE][] # same
    #    a b col1  col2 mean sum Name1 Longer%
    # 1: 1 1   hi hello  2.5   5    10      15
    # 2: 2 2   hi hello  3.5   7    14      21
    # 3: 3 3   hi hello  4.5   9    18      27
    # 4: 1 4   hi hello  2.5   5    10      15
    # 5: 2 5   hi hello  3.5   7    14      21
    # 6: 3 6   hi hello  4.5   9    18      27
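
To tie this back to the second part of the question (column names stored in a variable, filled by a function), here is a minimal sketch. The helper `another_f()` and the column names `b_min`, `b_max`, `b_range`, `b_mean` and `b_sum` are hypothetical, made up for illustration; the helper is assumed to return a list with one element per target column. The last call uses the functional form of `:=`, which may be convenient when the column names are fixed rather than held in a variable.

```r
library(data.table)

# Toy data shaped like the example above: 'a' is the grouping column.
x <- data.table(a = rep(1:3, 2), b = 1:6)

# Hypothetical helper (not from the original post): returns a list,
# one element per new column, computed on a single group's values.
another_f <- function(v) {
  list(min(v), max(v), max(v) - min(v))
}

# Column names held in a variable, as in the question.
col_names <- c("b_min", "b_max", "b_range")

# Wrapping the variable in parentheses tells data.table to use its
# contents as the target column names (not a column called "col_names").
x[, (col_names) := another_f(b), by = a]

# Functional form of := for a fixed set of new columns.
x[, `:=`(b_mean = mean(b), b_sum = sum(b)), by = a]

print(x)
```

Each list element here has length 1, so it is recycled to fill every row of its group; returning a list (rather than a vector built with `c()`) is what lets each element map to its own column.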