Search code examples
mysqlsqldatetimedate-arithmeticdomo

Self Join? Were Staff Who Worked the Previous Week Active 3 Weeks ago - MYSQL


I'm trying to add a column to a production hours dataset that will tell if a provider who worked last week was also working three weeks earlier. The current dataset looks something like this:

RowID | ProviderID | ClientID |     DOS    |   DOS (Week)  | Hours 
  1   | 1111111111 | 22222222 | 11/2/2020  |   11/1/2020   | 2.5 
  2   | 1111111111 | 33333333 | 11/5/2020  |   11/1/2020   | 1 
  3   | 1111111111 | 44444444 | 10/13/2020 |   10/11/2020  | 3 

I'm trying to get an extra column 'Active 3 Weeks Prior' with y/n or 1/0 for values. For the above table, let's assume the provider started on 10/13/20. The new column would ideally populate like this:

RowID | ProviderID | ClientID |     DOS    |   DOS (Week)  | Hours | Active 3 weeks Prior 
  1   | 1111111111 | 22222222 | 11/2/2020  |   11/1/2020   | 2.5   |   Yes              
  2   | 1111111111 | 33333333 | 11/5/2020  |   11/1/2020   | 1     |   Yes
  3   | 1111111111 | 44444444 | 10/13/2020 |   10/11/2020  | 3     |   No

A couple extra tidbits: our org uses Sunday as the start of the week so DOS (Week) is the Sunday prior to the date of service. From what I've been reading so far, it seems like the solution here is some kind of self join, where the base production records are aggregated into weekly hours and compared with that same providerID's records for DOS (Week) - 21.

The trouble I'm having is: whether I'm on the right track in the first place with the self-join and how I would generate the y/n values based on the success or failure to find a matching value. Also, I suspect that joining based on a concatenate of ProviderID and DOS(Week) might be flawed? This is what I've been playing with so far.

Please let me know if I can clarify the question at all or am missing something very obvious. I truly appreciate any help, as I've been trying to figure out the right search terms to get a clue on the answer for a few days now.


Solution

  • If you are running MySQL 8.0, you can use window functions and a range specification:

    select t.*,
        (
            max(providerid) over(
                partition by providerid 
                order by dos
                range between interval 3 week preceding and interval 3 week preceding
            ) is not null
        ) as active_3_weeks_before
    from mytable t
    

    It is not really clear from your explanation and data what you mean by was also working three weeks earlier. What the query does is, for each row, to check if another row exists with the same supplier and a dos that is exactly 3 week before the dos of the current row. This can easily be adapted for some other requirement.


    Edit: if you want to check for any record within the last 3 weeks, you would change the window range to:

    range between interval 3 week preceding and interval 1 day preceding
    

    And if you want this in MySQL < 8.0, where window functions are not available, then you would use a correlated subquery:

    select t.*,
        exists (
            select 1
            from mytable t1
            where 
                t1.providerid = t.provider_id
                and t1.dos >= t.dos - interval 3 week
                and t1.dos <  t.dos
        ) as active_3_weeks_before
    from mytable t