I have two datasets, and I need to be able to join on null column values
*there are many null values in the 'Model' column for this table
Customer | Brand | Model |
---|---|---|
Bill | Nike | Dunk |
Kayla | Adidas | Shoe 2 |
Max | Nike |
2)
SaleID | Customer | Brand | Model |
---|---|---|---|
1234 | Mike | Puma | X3 |
5678 | Bill | Nike | Dunk |
7433 | Max | Nike |
I want to join entire rows from table one with table 2, where essentially all three values in a row from table 1 act as a single record to be joined with table 2. (Bill, Nike, Dunk) is one value essentially
So far I have tried:
create table blank as
select columns,
columns,
columns,
from table 1 as x
left join table 2 as y
on x.customer=y.customer and x.brand=y.brand and x.model=y.model
;
quit;
The problem I am running into with this code is the join only includes rows where the 'Model' column is not null. There are many sales IDs with null 'Model' and I would like to be able to join these records.
For example, the final output of joining these tables with my code is:
| SaleID | Customer | Brand | Model |
| 5678 | Bill | Nike | Dunk |
Where I would like for there to be a record for Max, but since there is a null value in that column in that record it is not joined
SAS itself would match those records (unless the values of MODEL in one or the other dataset is not actually all blanks but includes some other invisible characters) because SAS uses strictly binary logic. A=B is either TRUE or FALSE.
But most external database systems (Oracle, etc.) use TRI-level logic for comparisons. A=B will be neither TRUE nor FALSE when either A or B is a NULL value.
So you need to explicitly account for the NULL values in the test condition.
create table want as
select y.saleid
, x.*
from table1 x
left join table2 y
on x.customer=y.customer
and x.brand=y.brand
and (x.model=y.model or (x.model is null and y.model is null))
;