I have two tables and a section of Table1 looks something like this
EmployeeID | FirstName | LastName | Gender | Age |
---|---|---|---|---|
A100 | Bob | Odenkirk | Male | 30 |
A101 | Jon | Jones | NULL | 36 |
Table2 looks similar but contains a duplicate entry with some varying amount of missing/available information, i.e.,
EmployeeID | FirstName | LastName | Gender | Age |
---|---|---|---|---|
A101 | Jon | Jones | Male | NULL |
A103 | Angelina | Jolie | Female | 40 |
I'm fairly new to SQL and I initially tried
SELECT *
FROM Table1
UNION
SELECT *
FROM Table2
But obviously, the A101 row has different NULL values so it doesn't get treated as duplicates and I get:
EmployeeID | FirstName | LastName | Gender | Age |
---|---|---|---|---|
A100 | Bob | Odenkirk | Male | 30 |
A101 | Jon | Jones | NULL | 36 |
A101 | Jon | Jones | Male | NULL |
A103 | Angelina | Jolie | Female | 40 |
Is there a general way (i.e. if the table is large and not sure which values might be missing from Table1 or Table2) to somehow "fill in" the NULL values and get the following target output:
EmployeeID | FirstName | LastName | Gender | Age |
---|---|---|---|---|
A100 | Bob | Odenkirk | Male | 36 |
A101 | Jon | Jones | Male | 30 |
A103 | Angelina | Jolie | Female | 40 |
Assuming you can never have "competing" values (i.e., both tables either have the same value or one of them has a null
), you could union all
the two queries, and then group by the EmployeeID
and use max
to get the present values and ignore null
s:
SELECT EmployeeID, MAX(FirstName), MAX(LastName), MAX(Gender), MAX(Age)
FROM (SELECT EmployeeID, FirstName, LastName, Gender, Age
FROM Table1
UNION ALL
SELECT EmployeeID, FirstName, LastName, Gender, Age
FROM Table2) T
GROUP BY EmployeeID