Search code examples
rstatamissing-datatabulate

How do I tabulate a variable in Stata to show all values that are in my sample, even if they're not yet in the dataset?


I am trying to tabulate a variable for a report at work to show the completion rate by municipality of a survey that is currently in the field. A dataset is downloaded each week showing all the responses so far, and I need to generate a table that shows frequencies for all the municipalities in our sample, even if they are zero. Using the missing option doesn't work because the dataset the report is made from doesn't have any way of telling stata what the overall sample of municipalities is.

Is there a way to get it to show zero for values that aren't in the dataset, based on a list of values in the overall sample? I'm competent in R so if someone has a suggestion for how to do this in R that would be good too.

So far this is my code:

convert string variable to numeric variable with value labels

    encode municipality_name, gen(municipality)

export survey completion by municipality to .doc file

    asdoc tab municipality, mis

Solution

  • The community-contributed command fre from SSC allows labelled values that do not occur in the dataset to be tabulated as having zero frequency. Here is an example:

    . sysuse auto, clear
    (1978 Automobile Data)
    
    . fre foreign
    
    foreign -- Car type
    ----------------------------------------------------------------
                       |      Freq.    Percent      Valid       Cum.
    -------------------+--------------------------------------------
    Valid   0 Domestic |         52      70.27      70.27      70.27
            1 Foreign  |         22      29.73      29.73     100.00
            Total      |         74     100.00     100.00           
    ----------------------------------------------------------------
    
    . label def origin 42 "Extraterrestrial", add
    
    . fre foreign
    
    foreign -- Car type
    ----------------------------------------------------------------
                       |      Freq.    Percent      Valid       Cum.
    -------------------+--------------------------------------------
    Valid   0 Domestic |         52      70.27      70.27      70.27
            1 Foreign  |         22      29.73      29.73     100.00
            Total      |         74     100.00     100.00           
    ----------------------------------------------------------------
    
    . fre foreign, includelabeled
    
    foreign -- Car type
    -------------------------------------------------------------------------
                                |      Freq.    Percent      Valid       Cum.
    ----------------------------+--------------------------------------------
    Valid   0  Domestic         |         52      70.27      70.27      70.27
            1  Foreign          |         22      29.73      29.73     100.00
            42 Extraterrestrial |          0       0.00       0.00     100.00
            Total               |         74     100.00     100.00           
    -------------------------------------------------------------------------
    
    . ssc desc fre
    
    --------------------------------------------------------------------------------------------------------
    package fre from http://fmwww.bc.edu/repec/bocode/f
    --------------------------------------------------------------------------------------------------------
    
    TITLE
          'FRE': module to display one-way frequency table
    
    DESCRIPTION/AUTHOR(S)
    
            fre displays, for each specified variable, a univariate
          frequency table containing counts, percent, and cumulative
          percent. Variables may be string or numeric. Labels, in full
          length, and values are printed. By default, fre only tabulates
          the smallest and largest 10 values (along with all missing
          values), but this can be changed. Furthermore,  values with zero
          observed frequency may be included in the  tables. The default
          for fre is to display the frequency  tables in the results
          window. Alternatively, the tables may be written to a file on
          disk, either tab-delimited or  LaTeX-formatted.
    
          KW: data management
          KW: frequencies
          KW: frequency table
          KW: tabulation
    
          Requires: Stata version 9.2
    
          Distribution-Date: 20150603
    
          Author: Ben Jann, University of Bern
          Support: email [email protected]
    
    
    INSTALLATION FILES                                  (type net install fre)
          fre.ado
          fre.hlp
    
    ANCILLARY FILES                                     (type net get fre)
          fre.zip
    --------------------------------------------------------------------------------------------------------
    (type ssc install fre to install)