
Get all pairs by groups in data.table


I want to extract all pairs of values within groups defined by two columns of a data.table.

I have the following code:


library(data.table)

dt = data.table(g1 = c(1, 1, 1, 2, 2, 2, 2, 2),
                g2 = c(3, 3, 4, 5, 5, 6, 6, 6),
                values = letters[1:8])

dt[, .(ind = combn(values, 2, paste0, collapse = "")), by = .(g1, g2)]

But this returns the error

Error in combn(values, 2, paste0, collapse = "") : n < m

The ideal result is some object that contains groups such as:

a,b
c
d,e
f,g
f,h
g,h

If c was not included because there is only one value, that is not a huge concern to me.

Is this possible to get using data.table?

Thanks in advance.

Note: The answer to "Generate All ID Pairs, by group with data.table in R" does what I want for one grouping level.


Solution

  • You can try the code below

    > setnames(dt[, combn(values, min(2, .N), toString), by = .(g1, g2)], "V1", "ind")[]
          g1    g2    ind
       <num> <num> <char>
    1:     1     3   a, b
    2:     1     4      c
    3:     2     5   d, e
    4:     2     6   f, g
    5:     2     6   f, h
    6:     2     6   g, h
    

    or

    > dt[, .(ind = combn(values, min(2, .N), toString, FALSE)), by = .(g1, g2)][, ind := unlist(ind)][]
          g1    g2    ind
       <num> <num> <char>
    1:     1     3   a, b
    2:     1     4      c
    3:     2     5   d, e
    4:     2     6   f, g
    5:     2     6   f, h
    6:     2     6   g, h
    

    Here min(2, .N) caps the number of elements chosen at the group size, so a group with a single row returns that one value instead of triggering the n < m error.
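
Since the question says leaving out c would be acceptable, a variant that drops single-row groups may also be useful. The sketch below is only one possible approach, not part of the answer above: it returns NULL from j for groups with fewer than two rows (which data.table omits from the result) and puts the two members of each pair in separate columns. The names pairs, first and second are illustrative assumptions.

```r
library(data.table)

dt <- data.table(g1 = c(1, 1, 1, 2, 2, 2, 2, 2),
                 g2 = c(3, 3, 4, 5, 5, 6, 6, 6),
                 values = letters[1:8])

# For each (g1, g2) group with at least two rows, build every pair.
# Groups with a single row evaluate to NULL and are dropped from the result.
pairs <- dt[, if (.N >= 2) {
  m <- combn(values, 2)               # 2 x choose(.N, 2) character matrix
  .(first = m[1, ], second = m[2, ])  # one row per pair
}, by = .(g1, g2)]

print(pairs)
```

With the example data this should give five rows, one per pair, with no row for the (1, 4) group that contains only c. If a single string per pair is preferred, paste(first, second, sep = ", ") on the result would combine the two columns.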