Search code examples
sqlsql-serverdatetimeinner-join

Query on two tables with overlapping dates


scale_table is a table of employee ID's, their pay scale, and the start/end dates that the pay scale was valid for:

empl_id scale_id  date_from   date_to    
------- --------- ----------- -----------
187     B3EL9     2014-03-01  2017-06-30  
187     B4EL6     2017-07-01  2019-10-31  
187     B5EL9     2019-11-01  2099-12-31  
214     M115      2006-10-01  2099-12-31  
618     B3L9      2014-01-01  2019-10-31  
618     B6L9      2019-11-01  2099-12-31  

value_table lists all of the pay scales, pay amount, and the start/end dates that the amount was valid for that pay scale:

scale_id  amount    date_from   date_to    
--------- --------- ----------- -----------
B3EL9     78084.00  2013-01-01  2015-06-30  
B3EL9     81432.00  2015-07-01  2099-12-31  
B4EL6     78348.00  2013-01-01  2015-06-30  
B4EL6     81720.00  2015-07-01  2099-12-31  
B5EL9     95964.00  2013-01-01  2015-06-30  
B5EL9     100092.00 2015-07-01  2099-12-31  
B3L9      52728.00  2013-01-01  2015-08-15  
B3L9      54996.00  2015-08-16  2017-11-30  
B3L9      56100.00  2017-12-01  2020-11-15  
B3L9      56664.00  2020-11-16  2099-12-31  
B6L9      64140.00  2013-01-01  2015-08-15  
B6L9      66900.00  2015-08-16  2017-11-30  
B6L9      68244.00  2017-12-01  2020-11-15  
B6L9      68928.00  2020-11-16  2099-12-31  
M115      108528.00 2012-07-01  2015-06-30  
M115      115128.00 2015-07-01  2099-12-31  

I need a query to find all changes in employee salaries between 2015-01-01 and current date. Query results should be ordered by employee then date_from.

Expected results for employee 187 is:

empl_id scale_id amount    date_from  date_to
------- -------- ------    ---------- -----------
187     B3EL9    78084.00  2015-01-01  2015-06-30  
187     B3EL9    81432.00  2015-07-01  2017-06-30  
187     B4EL6    81720.00  2017-07-01  2019-10-31  
187     B5EL9    100092.00 2019-11-01  2099-12-31  

Solution

  • This is a pretty classic overlapping date ranges problem.

    This is a great answer which covers it: Determine Whether Two Date Ranges Overlap

    For your code, it would be something like this (see bottom of post for code used to generate the temp tables):

    DECLARE @StartDate date = '2015-01-01',
            @EndDate date = '2020-12-14';
    
    SELECT s.empl_id
         , s.scale_id
         , v.amount
         , StartDate    = IIF(x.EmpScaleStartDate < @StartDate, @StartDate, x.EmpScaleStartDate) -- Clamp the start date
         , EndDate      = x.EmpScaleEndDate
    FROM #scale_table s
        JOIN #value_table v ON v.scale_id = s.scale_id
        CROSS APPLY (
            SELECT EmpScaleStartDate    = IIF(v.date_from <= s.date_from, s.date_from, v.date_from) -- Pick the valid start date
                ,  EmpScaleEndDate      = IIF(s.date_to   <= v.date_to  , s.date_to  , v.date_to) -- Pick the valid end date
        ) x
    WHERE (s.date_from <= v.date_to) AND (s.date_to >= v.date_from) -- Do the two ranges overlap?
        AND (s.date_to >= @StartDate AND s.date_from <= @EndDate) -- Only look at records within our target range
    ORDER BY s.empl_id, s.date_from
    

    For emp 187, this will output:

    | empl_id | scale_id | amount    | StartDate  | EndDate    | 
    |---------|----------|-----------|------------|------------| 
    | 187     | B3EL9    | 78084.00  | 2015-01-01 | 2015-06-30 | 
    | 187     | B3EL9    | 81432.00  | 2015-07-01 | 2017-06-30 | 
    | 187     | B4EL6    | 81720.00  | 2017-07-01 | 2019-10-31 | 
    | 187     | B5EL9    | 100092.00 | 2019-11-01 | 2099-12-31 | 
    

    Explanation:

    You're basically dealing with 3 separate date ranges and you want to find where they all overlap.

    Those date ranges being:

    • The start/end date for the pay scales amount
    • The start/end date for the period the pay scale was assigned to the employee
    • The start/end date of the report

    The first step is to only grab valid records:

    DECLARE @StartDate date = '2015-01-01', @EndDate date = '2020-12-14';
    SELECT *
    FROM #scale_table s
    WHERE s.date_to >= @StartDate AND s.date_from <= @EndDate
    

    This is saying:

    Make sure we only pull employee records where their scale dates are within the range we care about. Due to the data you supplied, that's every employee record.


    Next:

    We want to join #value_table to these records so that we can see if there were any changes to each of those pay scales during the time they had it.

    DECLARE @StartDate date = '2015-01-01', @EndDate date = '2020-12-14';
    SELECT *
    FROM #scale_table s
        JOIN #value_table v ON v.scale_id = s.scale_id
    WHERE (s.date_from <= v.date_to) AND (s.date_to >= v.date_from) -- Do the two ranges overlap?
        AND (s.date_to >= @StartDate AND s.date_from <= @EndDate) -- Only look at records within our target range
    

    Now we have a dataset we can play with. We have a record of every employee and their pay scale history, as well as the changes to the scales themselves.

    Now we just have to figure out how to get the right Start/End dates...


    Next:

    That's where this comes in:

    CROSS APPLY (
        SELECT EmpScaleStartDate    = IIF(v.date_from <= s.date_from, s.date_from, v.date_from)
            ,  EmpScaleEndDate      = IIF(v.date_to   >= s.date_to  , s.date_to  , v.date_to)
    ) x
    

    This logic is deciding which start or end date to display. I only use a CROSS APPLY so that it is easier to re-use this logic throughout the query, and to make it a little more readable so you don't have a bunch of nested functions in a single line.

    If 187 had B3EL9 from 2014-03-01 to 2017-06-30

    And the pay scale paid $78,084 from 2013-01-01 to 2015-06-30

    Then we should show a start date of 2014-03-01 and an end date of 2015-06-30 for that row.


    The last step:

    Clamping the first date. Clamping means to bind one value within the range of two other values. Since this report is being run based on 2015-01-01 to Current. We want ranges which begin in 2013 or 2014 to show 2015-01-01 instead.

    That's all this is doing:

    SELECT StartDate = IIF(x.EmpScaleStartDate < @StartDate, @StartDate, x.EmpScaleStartDate)
    

    Sample Data in SQL form:

    IF OBJECT_ID('tempdb..#scale_table','U') IS NOT NULL DROP TABLE #scale_table; --SELECT * FROM #scale_table
    CREATE TABLE #scale_table (
        empl_id     int             NOT NULL,
        scale_id    varchar(100)    NOT NULL,
        date_from   date            NOT NULL,
        date_to     date            NOT NULL
    );
    
    INSERT INTO #scale_table (empl_id, scale_id, date_from, date_to)
    VALUES (187,'B3EL9', '2014-03-01','2017-06-30')
        ,  (187,'B4EL6', '2017-07-01','2019-10-31')
        ,  (187,'B5EL9', '2019-11-01','2099-12-31')
        ,  (214,'M115' , '2006-10-01','2099-12-31')
        ,  (618,'B3L9' , '2014-01-01','2019-10-31')
        ,  (618,'B6L9' , '2019-11-01','2099-12-31');
    
    IF OBJECT_ID('tempdb..#value_table','U') IS NOT NULL DROP TABLE #value_table; --SELECT * FROM #value_table
    CREATE TABLE #value_table (
        scale_id    varchar(100)    NOT NULL,
        amount      decimal(10, 2)  NOT NULL,
        date_from   date            NOT NULL,
        date_to     date            NOT NULL
    );
    
    INSERT INTO #value_table (scale_id, amount, date_from, date_to)
    VALUES ('B3EL9',78084.00  ,'2013-01-01', '2015-06-30')
        ,  ('B3EL9',81432.00  ,'2015-07-01', '2099-12-31')
        ,  ('B4EL6',78348.00  ,'2013-01-01', '2015-06-30')
        ,  ('B4EL6',81720.00  ,'2015-07-01', '2099-12-31')
        ,  ('B5EL9',95964.00  ,'2013-01-01', '2015-06-30')
        ,  ('B5EL9',100092.00 ,'2015-07-01', '2099-12-31')
        ,  ('B3L9 ',52728.00  ,'2013-01-01', '2015-08-15')
        ,  ('B3L9 ',54996.00  ,'2015-08-16', '2017-11-30')
        ,  ('B3L9 ',56100.00  ,'2017-12-01', '2020-11-15')
        ,  ('B3L9 ',56664.00  ,'2020-11-16', '2099-12-31')
        ,  ('B6L9 ',64140.00  ,'2013-01-01', '2015-08-15')
        ,  ('B6L9 ',66900.00  ,'2015-08-16', '2017-11-30')
        ,  ('B6L9 ',68244.00  ,'2017-12-01', '2020-11-15')
        ,  ('B6L9 ',68928.00  ,'2020-11-16', '2099-12-31')
        ,  ('M115 ',108528.00 ,'2012-07-01', '2015-06-30')
        ,  ('M115 ',115128.00 ,'2015-07-01', '2099-12-31');