This is the given data
Company_NO Hierarchy_1 Hierarchy_2
1234 Insurance A
1234 Insurance A
1234 Auto B
5678 Claims B
5678 Claims B
5678 New C
In the above table, the column hierarchy_2 has three distinct values A,B,C. In the above example, for company_no=1234, since there is no row for hierarchy_2='C', that row should still appear. That is company_no=1234, hierarchy_1='NA', hierarchy_2='C'
Expected Output:
Company_NO Hierarchy_1 Hierarchy_2
1234 Insurance A
1234 Insurance A
1234 Auto B
1234 NA C
5678 Claims B
5678 Claims B
5678 New C
5678 NA A
As you can see above, an extra row is added with hierarchy_1='NA' whenever there is an empty row. Please help! Thank you!
One option is to first create all combinations of company_no and hierarchy_2 and then left join your dataset on this table:
data have;
length company_no 8. hierarchy_1 hierarchy_2 $20;
input company_no hierarchy_1 $ hierarchy_2 $;
datalines;
1234 Insurance A
1234 Insurance A
1234 Auto B
5678 Claims B
5678 Claims B
5678 New C
;
run;
proc sql;
create table want as
select a.company_no
,case when missing(c.hierarchy_1) then "NA"
else c.hierarchy_1
end as hierarchy_1
,b.hierarchy_2
from (select distinct company_no from have) as a
cross join (select distinct hierarchy_2 from have) as b
left join have as c
on a.company_no = c.company_no and
b.hierarchy_2 = c.hierarchy_2
;
quit;