Search code examples
sqljoinsnowflake-cloud-data-platform

SQL Left Join in a many to many relationship situation


I am trying to collect some average survey scores on employees by location over time. These people can transfer locations over time for various reasons. I have 2 tables.

Survey_Scores: this table houses individual survey scores. It contains 3 million rows.

Sample:

Survey_Date Employee_Number Score (1-4)
1/2/2023 1 3
1/5/2023 2 4

Employee_Info: this table houses info about the employees. This info can change over time and this history is kept by having multiple lines for one person in this table. Their rows can be distinguished by the columns record start and record end dates. When the record end date is null, that is their most current information.

Sample showing 2 employees who traded locations on January 4th:

Employee_Number Location Record_Start_Date Record_End_Date
1 SLC 1/1/2023 1/3/2023
2 TEX 1/1/2023 1/3/2023
1 TEX 1/4/2023 (null)
2 SLC 1/4/2023 (null)

PROBLEM: I need to create a view that starts as the Survey_Scores table, but with an added column that denotes the location the employee belonged to as of the Survey_Date.

I tried joining the tables on the employee_Number columns, but got an error in snowflake due to the many to many relationship here. I understand why it isn't working, because it doesn't know which row to look at, so I'm wondering if there is some workaround where I can make it return the location of where that person was on the survey_date.

My initial thought beyond what i have already done is to somehow work in an if statement that says: if the survey_date is between record_start_date and record_end_date, then return location. But I'm not quite sure how to get going on implementing that.


Solution

  • I tried joining the tables on the employee_Number fields but got an error in snowflake due to the many to many relationship here

    SQL allows to join on more than one field/predicate:

    SELECT s.*, e.Location
    FROM Survey_Scores AS s
    LEFT JOIN Employee_Info AS e
      ON s.Employee_Number = e.Employee_Number
     AND s.Survey_Date BETWEEN e.Record_Start_Date 
                           AND COALESCE(e.Record_End_Date, '2999-12-31'::DATE);