r, data.table, subset, attribution

Assignment in a data.table via clever subsetting


I'm trying the following clever assignment using data.table:

for (s in c('G1', 'G2')) {

  t[t[  , .I[seq(which.max(get(s)), .N)], by = GROUP]$V1, get(s) := 1]  

} 

For some reason, it fails with this error:

Error in get(s) : object 'G1' not found

But, the explicit form works nicely:

t[t[  , .I[seq(which.max(G1), .N)], by = GROUP]$V1, G1 := 1]  

Of course, my real dataset has many columns whose names are not known in advance. What am I doing wrong here?

Here is a sample dataset:

G1  G2  GROUP
0.081975988 0.281210522 A
0.726230621 0.91873287  A
0.938997082 0.146669516 A
0.10564305  0.219593442 A
0.112977071 0.451366779 A
0.157260728 0.570366021 A
0.586841571 0.742955139 B
0.418178989 0.584326765 B
0.290443749 0.435277405 B
0.682695255 0.138739152 B
0.992847073 0.198544311 B
0.401170904 0.347155973 B
0.591182359 0.219964292 C
0.003935376 0.231136145 C
0.666710774 0.479126371 C
0.791187106 0.153873696 C
0.921437692 0.31429481  C
0.88193519  0.801150898 C
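
For reproducibility, the sample above can be loaded roughly as follows. This is only a sketch: it assumes the data.table package is installed, and it keeps the name t from the code above, which masks base::t in the global environment.

```r
library(data.table)

# Read the sample rows into a data.table; fread() detects the space separator
# and reads GROUP as a character column.
t <- fread(text = "G1 G2 GROUP
0.081975988 0.281210522 A
0.726230621 0.91873287 A
0.938997082 0.146669516 A
0.10564305 0.219593442 A
0.112977071 0.451366779 A
0.157260728 0.570366021 A
0.586841571 0.742955139 B
0.418178989 0.584326765 B
0.290443749 0.435277405 B
0.682695255 0.138739152 B
0.992847073 0.198544311 B
0.401170904 0.347155973 B
0.591182359 0.219964292 C
0.003935376 0.231136145 C
0.666710774 0.479126371 C
0.791187106 0.153873696 C
0.921437692 0.31429481 C
0.88193519 0.801150898 C")

str(t)  # inspect column types and dimensions
```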

Solution

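  • The problem is in the assignment part. The left-hand side of := expects column names, and s already holds the name as a character string, while get(s) asks for the column object itself, which is essentially a vector. data.table evaluates a left-hand side that is a call in the calling scope rather than inside the table, so get(s) looks for an object named G1 there and fails. What you need is (s) := 1; see also "Select / assign to data.table variables which names are stored in a character vector".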

    for (s in c('G1', 'G2')) {
      t[t[, .I[seq(which.max(get(s)), .N)], by = GROUP]$V1, (s) := 1][] 
    }
    
    t
    #             G1        G2 GROUP
    # 1: 0.081975988 0.2812105     A
    # 2: 0.726230621 1.0000000     A
    # 3: 1.000000000 1.0000000     A
    # 4: 1.000000000 1.0000000     A
    # 5: 1.000000000 1.0000000     A
    # 6: 1.000000000 1.0000000     A
    # 7: 0.586841571 1.0000000     B
    # 8: 0.418178989 1.0000000     B
    # 9: 0.290443749 1.0000000     B
    #10: 0.682695255 1.0000000     B
    #11: 1.000000000 1.0000000     B
    #12: 1.000000000 1.0000000     B
    #13: 0.591182359 0.2199643     C
    #14: 0.003935376 0.2311361     C
    #15: 0.666710774 0.4791264     C
    #16: 0.791187106 0.1538737     C
    #17: 1.000000000 0.3142948     C
    #18: 1.000000000 1.0000000     C
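
The difference between the forms of the left-hand side can be checked on a smaller table. The sketch below uses made-up data (the table dt, the copy tmp and all values are illustrative, not taken from the question) so that the result is easy to verify by hand. It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical toy data, chosen so each group's maximum is easy to spot.
dt <- data.table(G1 = c(0.2, 0.9, 0.5, 0.8, 0.3, 0.1),
                 G2 = c(0.7, 0.1, 0.4, 0.3, 0.2, 0.6),
                 GROUP = c("A", "A", "A", "B", "B", "B"))

for (s in c("G1", "G2")) {
  # Row numbers from each group's maximum of column s to the group's last row.
  idx <- dt[, .I[seq(which.max(get(s)), .N)], by = GROUP]$V1
  # (s) is evaluated first, so the assignment targets the column named by s.
  dt[idx, (s) := 1]
}

# A bare symbol on the left-hand side is taken literally as a column name:
# this adds a column called "s" to a copy instead of touching G1 or G2.
tmp <- copy(dt)
tmp[, s := 1]

dt$G1       # 0.2 1.0 1.0 1.0 1.0 1.0
dt$G2       # 1.0 1.0 1.0 0.3 0.2 1.0
names(tmp)  # "G1" "G2" "GROUP" "s"
```

Computing idx in a separate step is only a readability choice here; it is the same subsetting expression as in the one-liner above.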