I have build a table where the headers each column except the first column is a unique value
, and every cell in the first column is a unique value
- just the headers but vertical.
values | value1 | value2 | value3 |
---|---|---|---|
value1 | |||
value2 | |||
value3 |
I also have a dataset group
. Every unique group
can have multiple unique values
.
group | value |
---|---|
group1 | value1 |
group1 | value2 |
group1 | value3 |
group2 | value2 |
group2 | value3 |
group2 | value4 |
group2 | value5 |
I want to see the amount of times every unique value
appears alongside any other unique value
that is not itself within the same group
by populating my first table.
Desired output, given the above example group
dataset:
values | value1 | value2 | value3 |
---|---|---|---|
value1 | 0 | 1 | 1 |
value2 | 1 | 0 | 2 |
value3 | 1 | 2 | 0 |
I have difficulties finding answers as I am not sure what the type of output I want is called. How would I go about achieving my desired results?
Thank you for your help.
Create a cross-join that counts the number of times each value matches with a different value:
proc sql noprint;
create table count_values as
select t1.value as value_t1
, t2.value as value_t2
, count(*) as total
from have as t1, have as t2
where t1.group = t2.group
AND t1.value NE t2.value
group by t1.value, t2.value
;
quit;
Which gives you this:
value_t1 value_t2 total
value1 value2 1
value1 value3 1
value2 value1 1
value2 value3 2
value3 value1 1
value3 value2 2
Now we can do a few different things. You can use proc tabulate
and generate your table:
options missing='0';
proc tabulate data = count_values
out = totals
format = 8.0;
class value_t1 value_t2;
var total;
/* ' ' eliminates row headings */
table value_t1 = ' ',
value_t2 = ' '*total = ' '*sum=' '
;
run;
Or, you can turn it into a dataset:
proc transpose data = count_values
out = count_values_tpose(drop=_NAME_);
by value_t1;
id value_t2;
var total;
run;
data want;
/* Set var order */
length value_t1 $8.
value1-value3 8.
;
set count_values_tpose;
array value[*] value1-value3;
/* Set missing to 0 */
do i = 1 to dim(value);
if(value[i] = .) then value[i] = 0;
end;
rename value_t1 = values;
drop i;
run;
Output:
values value1 value2 value3
value1 0 1 1
value2 1 0 2
value3 1 2 0