Search code examples
stata

Stata - Generate unique combinations ignoring order


I am trying to create a dataset that lists all combinations of rows by ID, but ignores the order of combinations (i.e. Apple -> Orange is the same as Orange -> Apple). I started by using this question, which gets me all combinations, keeping different orders. How can I modify this code for my desired output? Or should I being using a different approach altogether, and what would the approach be?

Code of what I currently do using linked question:

clear all

input id str10 fruit 
1 "Apple"
1 "Banana"
1 "Orange"
2 "Orange"
2 "Apple"
3 "Pear"
3 "Kiwi"
3 "Apple"
3 "Lemon"
end

tempfile t1
save `t1'
clear

use `t1'
rename fruit f2
keep id f2
joinby id using `t1'
order id fruit f2
sort id  fruit f2

drop if fruit==f2

list, sepby(id)

My desired output is:

ID  fruit   f2
1   Apple   Banana
1   Apple   Orange
1   Banana  Orange
2   Orange  Apple
3   Pear    Kiwi
3   Pear    Apple
3   Pear    Lemon
3   Kiwi    Apple
3   Kiwi    Lemon
3   Apple   Lemon

Solution

  • After joinby, you could generate an auxiliary variable which puts fruit and f2 in one variable and sorts them, assuring that the same combinations have the same values. Then you can use duplicates drop in terms of this variable and id to drop duplicates.

    clear 
    input id str10 fruit 
    1 "Apple"
    1 "Banana"
    1 "Orange"
    2 "Orange"
    2 "Apple"
    3 "Pear"
    3 "Kiwi"
    3 "Apple"
    3 "Lemon"
    end
    
    tempfile t1
    save `t1'
    
    rename fruit f2
    joinby id using `t1'
    order id fruit f2
    sort id  fruit f2
    
    drop if fruit==f2
    gen combination = cond(fruit < f2, fruit + " " + f2, f2 + " " + fruit)
    duplicates drop id combination, force
    drop combination
    
    list, sepby(id)
    
         +----------------------+
         | id    fruit       f2 |
         |----------------------|
      1. |  1    Apple   Banana |
      2. |  1    Apple   Orange |
      3. |  1   Banana   Orange |
         |----------------------|
      4. |  2    Apple   Orange |
         |----------------------|
      5. |  3    Apple     Kiwi |
      6. |  3    Apple    Lemon |
      7. |  3    Apple     Pear |
      8. |  3     Kiwi    Lemon |
      9. |  3     Kiwi     Pear |
     10. |  3    Lemon     Pear |
         +----------------------+