Search code examples
sasproc-sqldatastep

Displaying 'NA' when it's an empty row in SAS


This is the given data

Company_NO Hierarchy_1 Hierarchy_2
1234       Insurance   A
1234       Insurance   A
1234       Auto        B
5678       Claims      B
5678       Claims      B
5678       New         C

In the above table, the column hierarchy_2 has three distinct values A,B,C. In the above example, for company_no=1234, since there is no row for hierarchy_2='C', that row should still appear. That is company_no=1234, hierarchy_1='NA', hierarchy_2='C'

Expected Output:

Company_NO Hierarchy_1 Hierarchy_2
1234       Insurance   A
1234       Insurance   A
1234       Auto        B
1234       NA          C
5678       Claims      B
5678       Claims      B
5678       New         C
5678       NA          A

As you can see above, an extra row is added with hierarchy_1='NA' whenever there is an empty row. Please help! Thank you!


Solution

  • One option is to first create all combinations of company_no and hierarchy_2 and then left join your dataset on this table:

    data have;
    length company_no 8. hierarchy_1 hierarchy_2 $20;
    input company_no hierarchy_1 $ hierarchy_2 $;
    datalines;
    1234     Insurance   A
    1234     Insurance   A
    1234     Auto        B
    5678     Claims      B
    5678     Claims      B
    5678     New         C
    ;
    run;
    
    proc sql;
       create table want as 
          select a.company_no
                ,case when missing(c.hierarchy_1) then "NA"
                      else c.hierarchy_1
                end as hierarchy_1
                ,b.hierarchy_2
          from (select distinct company_no from have) as a
          cross join (select distinct hierarchy_2 from have) as b
          left join have as c
             on a.company_no = c.company_no and
                b.hierarchy_2 = c.hierarchy_2
       ;
    quit;