I have a data table init
that looks like this:
> init
+---+
| id|
+---+
| a|
+---+
| b|
+---+
| c|
+---+
I want to obtain all pairs for id
column, so I need to cross join the init
data table on itself. Additionally, I want to exclude equal and symmetric results (in my case a,b == b,a
, etc.).
Desired output is:
+---+---+
|id1|id2|
+---+---+
| a| b|
| a| c|
| b| c|
+---+---+
How can this be done with the data.table
approach?
Full cross join can be implemented as
full_cj <- CJ(init$id, init$id)
:
+---+---+
| V1| V2|
+---+---+
| a| a|
| a| b|
| a| c|
| b| a|
| b| b|
| b| c|
| c| a|
| c| b|
| c| c|
+---+---+
But how can I remove symmetrical and identical results from the output?
My real data is huge, so I'm looking for an efficient solution.
You can use a non equi join after converting to factor
dt[, id2:=as.factor(id)][dt, on=.(id2>id2),nomatch=0,.(id, id2)]