Search code examples
sqlsql-serverdatewindow-functionsgaps-and-islands

Subtract previous row to calculate column value in MS SQL


I have below data which needs to be formatted to display only consecutive days without a break.

For example, I want to display the records with dates from 2018-06-18 until 2018-06-21 as these are consecutive days. But then I don't want to show rows with dates of 2018-06-27 or 2018-07-03 as these days don't have a sequence. The exception is for Friday-Monday where there will be obvious gap. For example if the dates are 2018-06-21 (Thursday), 2018-06-22 (Friday) and 2018-06-25 (Monday), than that will be a valid sequence.

Sequence of 2018-6-1, 2018-6-3, 2018-6-4 has to be hidden as it is broken (there was no record of 2018-6-2 in the table).

I had written below, but it fails to deliver the output I am looking for. Any help will be greatly appreciated.

select date, diff, Percent_gain, day, month, Year
from (
   select date
      ,datediff(day, '2018-1-29',date) - coalesce(lag(datediff(day, '2018-1-29', date)) over (order by date), 0) as diff
      ,Percent_gain, day, month, year
   from tab
) t1
where t1.diff < 2
   and t1.Percent_gain < 0

enter image description here

Output

enter image description here


Solution

  • You can use window functions and a conditional expression:

    select *
    from (
        select t.*, lag(date) over(order by date) lag_date
        from mytable t
        where percent_gain < 0
    ) t
    where 
        lag_date is null
        or date = dateadd(day, case when datename(dw, date) = 'Monday' then 3 else 1 end, lag_date)
    

    The subquery filters on rows whose percent_gain is negative, and uses lag() to retrive the date of the "previous" row. Then, the outer query filters again according to the following criteria:

    • either the row is the "first" one (ie there is no previous date)

    • or the date of the current row one day after the previous date - excepted if the current date is a Monday, in which case the current date should be 3 days later than the previous date

    Note that the query does not actually uses columns day, month and year; those look like derived values, that can easily be computed on the fly when needed. Typically, we use datename() to get the day name.