Search code examples
sasmedical

SAS: PROC FREQ combinations automatically?


I have a patient dataset that looks like the below table and I would like to see which diseases run together and ultimately make a heatmap. I used PROC FREQ to make this list table, but it is too laborious to go through like this because it gives me every combination (thousands).

Moya    Hypothyroid Hyperthyroid    Celiac
   1       1           0             0
   1       1           0             0       
   0       0           1             1
   0       0           0             0
   1       1           0             0
   1       0           1             0
   1       1           0             0
   1       1           0             0
   0       0           1             1
   0       0           1             1


proc freq data=new;
tables HOHT*HOGD*CroD*Psor*Viti*CelD*UlcC*AddD*SluE*Rhea*PerA/list;
run;

I would ultimately like a bunch of cross tabs as I show below, so I can see how many patients have each combination. Obviously it's possible to copy paste each variable like this manually, but is there any way to see this quickly or automate this?

proc freq data=new;
tables HOHT*HOGD/list;
run;

proc freq data=new;
tables HOHT*CroD/list;
run;


proc freq data=new;
tables HOHT*Psor/list;
run;

Thanks!


Solution

  • One can control the tables generated in PROC FREQ with the TABLES statement. To generate tables that are 2-way contingency tables of all pairs of columns in a data set, one can write a SAS macro that loops through a list of variables, and generates TABLES statements to create all of the correct contingency tables.

    For example, using the data from the original post:

    data xtabs;
    input Moya    Hypothyroid Hyperthyroid    Celiac;
    datalines;
       1       1           0             0
       1       1           0             0       
       0       0           1             1
       0       0           0             0
       1       1           0             0
       1       0           1             0
       1       1           0             0
       1       1           0             0
       0       0           1             1
       0       0           1             1
    ;
    run;
    %macro gentabs(varlist=);
       %let word_count = %sysfunc(countw(&varlist));
       %do i = 1 %to (&word_count - 1);
          tables %scan(&varlist,&i,%str( )) * (
          %do j = %eval(&i + 1) %to &word_count;
            %scan(&varlist,&j,%str( ))
          %end; )
          ; /* end tables statement */
       %end;
    %mend;
    options mprint;
    proc freq data = xtabs;
      %gentabs(varlist=Moya Hypothyroid Hyperthyroid Celiac)
      run;
    

    The code generated by the SAS macro is:

     73         proc freq data = xtabs;
     74           %gentabs(varlist=Moya Hypothyroid Hyperthyroid Celiac)
     MPRINT(GENTABS):   tables Moya * ( Hypothyroid Hyperthyroid Celiac ) ;
     MPRINT(GENTABS):   tables Hypothyroid * ( Hyperthyroid Celiac ) ;
     MPRINT(GENTABS):   tables Hyperthyroid * ( Celiac ) ;
     75         run;
    

    ...and the first few tables from the resulting output looks like:

    enter image description here

    To add options to the TABLES statement, one would add code before the semicolon on the line commented as /* end tables statement */.