I have multiple datasets having different rows and fields.
dataset1
Customer_ID Date Category Address City School
4154124 1/2/2021 A balboa st. Canterbury Middleton
2145124 1/2/2012 A somewhere world St. Augustine
1621573 1/2/2012 A my_street somewhere St. Augustine
dataset2
Customer_ID Date Category Country Zipcode
14123 12/12/2020 B UK EW
416412 14/12/2020 B ES
dataset3
Customer_ID Date Category School University
4124123 07/12/2020 C Middleton Oxford
I would like a final dataset which includes all the columns (keeping only one copy of the common ones):
Customer_ID Date Category Address City School Country Zipcode University
4154124 1/2/2021 A balboa st. Canterbury Middleton
2145124 1/2/2012 A somewhere world St. Augustine
1621573 1/2/2012 A my_street somewhere St. Augustine
14123 12/12/2020 B UK EW
416412 14/12/2020 B ES
4124123 07/12/2020 C Middleton Oxford
would a left join be the best way to get the expected output? How I can keep Customer_ID Date and Category and duplicates column (e.g., School) only once?
You can achieve this using UNION ALL.
SELECT Customer_ID, Date, Category, Address, City, School, '' AS Country, '' AS ZipCode, '' AS university FROM dataset1
UNION ALL
SELECT Customer_ID, Date, Category, '', '', '', Country, Zipcode, '' FROM dataset2
UNION ALL
SELECT Customer_ID, Date, Category, '', '', School, '', '', University FROM dataset3