sqlsql-server

How to duplicate a record and change its start/end date based on the start/end on a joined table?


I have a Savings table where each row shows a software module, a stakeholder, and the amount of time that the module saves the stakeholder by automating one of their tasks. Each row shows how much time the task takes to do manually vs automatically, how often the task is performed (eg 2x per week), and an effective start and end date. For example:

Module Stakeholder Manual Task Time (min) Automated Task Time (min) Times Task Performed Frequency Effective Start Effective End
Task 1 Automation John Doe 1440 0.5 4 Week 4/6/2023 8/1/2023
Task 2 Automation Jane Smith 480 1 1 Day 10/29/2022 10/1/2023
Task 3 Automation Billy 2880 1 1 Month 6/1/2022 12/1/2023
Task 4 Automation John Doe 260 1 3 Week 9/1/2023 12/1/2023
Task 5 Automation John Doe 300 2 2 Week 4/1/2022 7/1/2022

This table will be joined to a Stakeholder table that includes their name, base salary, and an effective start and end date. If a stakeholder's salary changes, a new record is created for them, and the most current record is indicated by a NULL in the "Effective End" date field. For example:

Stakeholder Base Salary Effective Start Effective End
John Doe 60,000 1/1/2022 8/1/2023
John Doe 80,000 8/1/2023
Jane Smith 100,000 10/1/2022
Billy 95,000 1/1/2022

After joining these two tables, a few calculated fields will be added. These fields will convert the base salary to an hourly rate, and calculate how much time and labor costs were saved by the module during the effective date range.

But if a stakeholder has a salary change between the Savings "Effective Start" and "Effective End" dates, then both salaries need to be applied. For example, with "Task 1 Automation", the salary used in the calculation needs to be 60,000 until 8/1/2023, then 80,000 from 8/1/2023 until 12/1/2023. So the row needs to appear twice, with its "Effective Start" and "Effective End" dates updated accordingly. The output table should look like this:

Module Stakeholder Manual Task Time (min) Automated Task Time (min) Times Task Performed Frequency Effective Start Effective End Salary
Task 1 Automation John Doe 1440 0.5 4 Week 4/6/2023 8/1/2023 60,000
Task 1 Automation John Doe 1440 0.5 4 Week 8/1/2023 12/1/2023 80,000
Task 2 Automation Jane Smith 480 1 1 Day 10/29/2022 10/1/2023 100,000
Task 3 Automation Billy 2880 1 1 Month 6/1/2022 12/1/2023 95,000
Task 4 Automation John Doe 260 1 3 Week 9/1/2023 12/1/2023 80,000
Task 5 Automation John Doe 300 2 2 Week 4/1/2022 7/1/2022 60,000

What's the best way to do this?

EDIT

A few records added to the first sample table, and the expected salary added to the last sample table, for clarification

Clarification notes

  • Task 1 Automation -- Both salaries apply
  • Task 4 Automation -- Only the later salary applies
  • Task 5 Automation -- Only the earlier salary applies

The "Task 1 Automation" record should appear twice, because John Doe had a salary change in the middle of its effective date range. So the first copy should only include the slice of that date range where John Doe had a 60,000 salary. The second copy should only include the slice where John Doe had a 80,000 salary.

The "Task 4 Automation" record should only appear once with an 80,000 salary, because John Doe had a salary of 80,000 during the entire effective date range.

The "Task 5 Automation" record should only appear once with a 60,000 salary, because John Doe had a salary of 60,000 during the entire effective date range.


Solution

  • The following query solves your problem as asked. However it doesn't return 12/1/2023 for the second row as you have put in your desired results, but I cannot see where it should get it from either, so I suspect that is incorrect.

    The key point you were missing was putting the date matching criteria in your join condition. Once you add that you get the rows you desire.

    select
        s.[Module]
        , s.Stakeholder
        , s.ManualTaskMin
        , s.AutomatedTaskMin
        , s.TimesPerformed
        , EffectiveStart = iif(sh.EffectiveStart > s.EffectiveStart, sh.EffectiveStart, s.EffectiveStart)
        , EffectiveEnd = iif(sh.EffectiveEnd < s.EffectiveEnd, sh.EffectiveEnd, s.EffectiveEnd)
    from #Stakeholder sh
    join #Savings s
        on sh.Stakeholder = s.Stakeholder
        and s.EffectiveEnd >= sh.EffectiveStart
        and (s.EffectiveStart < sh.EffectiveEnd or sh.EffectiveEnd is null)
    order by s.[Module];
    

    Returns:

    Module Stakeholder ManualTaskMin AutomatedTaskMin TimesPerformed EffectiveStart EffectiveEnd
    Task 1 Automation John Doe 1440 0.5 4 2023-04-06 2023-08-01
    Task 1 Automation John Doe 1440 0.5 4 2023-08-01 2023-08-01
    Task 2 Automation Jane Smith 480 1.0 1 2022-10-29 2023-10-01
    Task 3 Automation Billy 2880 1.0 1 2022-06-01 2023-12-01
    Task 4 Automation John Doe 260 1.0 3 2023-09-01 2023-12-01
    Task 5 Automation John Doe 300 2.0 2 2022-04-01 2022-07-01

    DBFiddle