I am trying to collect some average survey scores on employees by location over time. These people can transfer locations over time for various reasons. I have 2 tables.
Survey_Scores
: this table houses individual survey scores. It contains 3 million rows.
Sample:
Survey_Date | Employee_Number | Score (1-4) |
---|---|---|
1/2/2023 | 1 | 3 |
1/5/2023 | 2 | 4 |
Employee_Info
: this table houses info about the employees. This info can change over time and this history is kept by having multiple lines for one person in this table. Their rows can be distinguished by the columns record start and record end dates. When the record end date is null, that is their most current information.
Sample showing 2 employees who traded locations on January 4th:
Employee_Number | Location | Record_Start_Date | Record_End_Date |
---|---|---|---|
1 | SLC | 1/1/2023 | 1/3/2023 |
2 | TEX | 1/1/2023 | 1/3/2023 |
1 | TEX | 1/4/2023 | (null) |
2 | SLC | 1/4/2023 | (null) |
PROBLEM: I need to create a view that starts as the Survey_Scores
table, but with an added column that denotes the location the employee belonged to as of the Survey_Date
.
I tried joining the tables on the employee_Number
columns, but got an error in snowflake due to the many to many relationship here. I understand why it isn't working, because it doesn't know which row to look at, so I'm wondering if there is some workaround where I can make it return the location of where that person was on the survey_date
.
My initial thought beyond what i have already done is to somehow work in an if statement that says: if the survey_date
is between record_start_date
and record_end_date
, then return location. But I'm not quite sure how to get going on implementing that.
I tried joining the tables on the employee_Number fields but got an error in snowflake due to the many to many relationship here
SQL allows to join on more than one field/predicate:
SELECT s.*, e.Location
FROM Survey_Scores AS s
LEFT JOIN Employee_Info AS e
ON s.Employee_Number = e.Employee_Number
AND s.Survey_Date BETWEEN e.Record_Start_Date
AND COALESCE(e.Record_End_Date, '2999-12-31'::DATE);