I have the following data based on distance between the cities.
Source Destination Distance
USA UK 1000
USA Spain 200
UK USA 1000
Germany Spain 500
Spain USA 200
I want to remove the duplicates where source and destination are same. For Example USA to UK will be same as UK to USA and hence the duplicate value needs to be removed.
Following is the desired output.
Source Destination Distance
USA UK 1000
USA Spain 200
Germany Spain 500
First produce a dummy variable to hold sorted source and destination by call sortc, then sort by dummy variable.
data have;
input Source $ Destination $ Distance;
cards;
USA UK 1000
USA Spain 200
UK USA 1000
Germany Spain 500
Spain USA 200
;
data temp;
set have;
length dummy $50.;
_var1=source; _var2=destination;
call sortc (of _:);
dummy=catx(' ',of _:);
drop _:;
run;
proc sort data=temp out=want(drop=dummy) nodupkey;
by dummy;
run;