Search code examples
stataunique-values

How to get unique values for all variables in a dataset


I'm using Stata. I have a dataset with approximately 1800 observations and 1050 variables. Most of them are categorical variables with a few categories. It looks something like this:

------------------------------------------------------
| id | fh_1 | fh_1a | fh_2 | fh_2a | fh_3 | fh_3a |...
------------------------------------------------------
|1111| 1    |closed | 2    | 4     | 1    | open  |...
------------------------------------------------------
|1112| 2    | open  | 1    | 2     | 3    | closed|...
------------------------------------------------------
.
.
.

I need to export to an Excel sheet the list of all variables in this dataset with all unique values for each variable. It should look something like this:

--------------------------
|variable | unique_values|
--------------------------
| fh      | 1 2 3 4 5    |
--------------------------
|fh_1a    | closed open  |
--------------------------
.
.
.

I think I need a loop with the command levelsof but I'm not sure how to build it. Any suggestions?


Solution

  • foreach v of var * { 
        levelsof `v' 
    }
    

    would be a start, but I haven't directly addressed how to make that output Excel-friendly.

    One possibility is to put all the output in string variables given that the number of observations exceeds the number of variables.

    gen varname = "" 
    gen levels = "" 
    local i = 1 
    foreach v of var * { 
        levelsof `v' 
        replace varname = "`v'" in `i' 
        replace levels = `"`r(levels)'"' in `i' 
        local ++i 
    }