Stata - Generate all possible combinations


I need to find all possible combinations of the following variables, each of which contains a given number of observations:

Variable Obs

  • Black 1
  • Pink 2
  • Yellow 6
  • Red 15
  • Green 17

For example: (black, pink), (black, pink, yellow), (black, pink, yellow, red), (red, green), and so on. Order is not important, so combinations that contain the same elements, such as (black, pink) and (pink, black), must be kept only once.

At the end, I also need to calculate the total number of observations for each combination.

What is the fastest method that is also the least error-prone?

I read about tuples, but I was not able to write the code myself.


Solution

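  • You can use tuples (to install it, run ssc install tuples), as in the example below. tuples lists every non-empty subset of its arguments exactly once, so (black, pink) and (pink, black) are not both produced. Note that I use postfile with a temporary name for the handle and a temporary file for the results. After the loop is complete, I open the temporary file colors and use gsort to sort by the observation count in descending order.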

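    * Requires the tuples package: ssc install tuples
    * tuples stores each combination in local macros tuple1, tuple2, ...
    * and the number of combinations in local macro ntuples
    tuples black pink yellow red green

    * Number of observations for each colour
    scalar black = 1
    scalar pink = 2
    scalar yellow = 6
    scalar red = 15
    scalar green = 17

    * Post one row per combination to a temporary results file
    tempname colors_handle
    tempfile colors
    postfile `colors_handle' str40 colors cnt using `colors', replace
    forvalues i = 1/`ntuples' {
        scalar total = 0
        foreach n of local tuple`i' {
            * scalar() forces the scalar even if a variable has the same name
            scalar total = scalar(total) + scalar(`n')
        }
        post `colors_handle' ("`tuple`i''") (scalar(total))
    }
    postclose `colors_handle'

    * Load the results and sort by total observations, largest first
    use `colors', clear
    gsort -cnt
    list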
    
    

    Output:

                                colors   cnt  
      1.   black pink yellow red green    41  
      2.         pink yellow red green    40  
      3.        black yellow red green    39  
      4.              yellow red green    38  
      5.          black pink red green    35  
      6.                pink red green    34  
      7.               black red green    33  
      8.                     red green    32  
      9.       black pink yellow green    26  
     10.             pink yellow green    25  
     11.         black pink yellow red    24  
     12.            black yellow green    24  
     13.                  yellow green    23  
     14.               pink yellow red    23  
     15.              black yellow red    22  
     16.                    yellow red    21  
     17.              black pink green    20  
     18.                    pink green    19  
     19.                   black green    18  
     20.                black pink red    18  
     21.                         green    17  
     22.                      pink red    17  
     23.                     black red    16  
     24.                           red    15  
     25.             black pink yellow     9  
     26.                   pink yellow     8  
     27.                  black yellow     7  
     28.                        yellow     6  
     29.                    black pink     3  
     30.                          pink     2  
     31.                         black     1
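
As a cross-check on the table above, the same 31 rows can be rebuilt without the tuples package. The following is only a sketch of a different technique (a bitmask over the observation number), not part of the answer's method: five colours give 2^5 - 1 = 31 non-empty subsets, and bit j of the row number decides whether colour j belongs to that row's subset. The local macro names (names, counts, k, n) and the helper variable pick are assumptions introduced for illustration.

```stata
* Sketch: enumerate every non-empty subset of the colours with a bitmask.
* Does not use tuples; all macro and variable names here are illustrative.
clear
local names  black pink yellow red green
local counts 1 2 6 15 17
local k : word count `names'
local n = 2^`k' - 1              // number of non-empty subsets (31 for 5 colours)

set obs `n'
generate str40 colors = ""
generate cnt = 0

forvalues j = 1/`k' {
    local name : word `j' of `names'
    local obs  : word `j' of `counts'
    * pick is 1 when bit j of the row number is set, i.e. colour j is in the subset
    generate byte pick = mod(floor(_n / 2^(`j' - 1)), 2)
    replace colors = trim(colors + " `name'") if pick
    replace cnt = cnt + `obs' if pick
    drop pick
}

* Same ordering as the answer: largest total first
gsort -cnt
list, noobs
```

If single-colour rows are not wanted (the examples in the question all contain at least two colours), running drop if wordcount(colors) < 2 after the loop would remove them. The order of rows with tied totals may differ from the listing above, since gsort does not fix the order of ties.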