
Drop variables with all missing values


I have 5000 variables and 91,534 observations in my dataset.

I want to drop every variable whose values are all missing, so that the first dataset below becomes the second:

X1     X2    X3
1      2      .
.      3      .
3      .      .
.      5      .

X1     X2
1      2  
.      3   
3      . 
.      5  

I tried the community-contributed command dropmiss, but I could not get it to work, even after reading the help file. For example:

dropmiss 
command dropmiss is unrecognized
r(199);

missings dropvars
force option required with changed dataset

Instead, as suggested in one of the solutions, I tried the following:

ssc install nmissing
nmissing, min(91534)  
drop `r(varlist)'

This alternative community-contributed command seems to work for me.

However, I wanted to know if there is a more elegant solution, or a way to use dropmiss.


Solution

  • In an up-to-date Stata, either search dropmiss or search nmissing will tell you that both commands are superseded by missings, published in the Stata Journal.

    The following dialogue may illuminate your question:

    . sysuse auto , clear
    (1978 Automobile Data)
    
    . generate empty = .
    (74 missing values generated)
    
    . missings dropvars
    force option required with changed dataset
    r(4);
    
    . missings dropvars, force
    
    Checking missings in make price mpg rep78 headroom trunk weight length turn
        displacement gear_ratio foreign empty:
    74 observations with missing values
    
    note: empty dropped
    

    missings dropvars, once installed, drops all variables that are entirely missing. You need the force option if the dataset in memory has changed since it was last saved, which is what the error message above is telling you.
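
    If you would rather not depend on a community-contributed command, the same result can be approximated with official commands only. The following is a minimal sketch, not the way missings itself is implemented; the local macro name todrop is an arbitrary choice for illustration. It relies on missing(), which treats system and extended missing values (., .a to .z) and empty strings as missing.

    ```stata
    * Sketch: drop variables that are missing in every observation,
    * using official Stata commands only.
    * "todrop" is an illustrative macro name, not part of any command.
    local todrop
    foreach v of varlist _all {
        * count the observations in which this variable is NOT missing
        quietly count if !missing(`v')
        if r(N) == 0 {
            local todrop `todrop' `v'
        }
    }

    * drop the collected variables in one go, if there are any
    if "`todrop'" != "" {
        drop `todrop'
    }
    ```

    Unlike missings dropvars, this loop does not check whether the dataset has unsaved changes before dropping, so save a copy first if that matters to you.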