Search code examples
sqlsql-servert-sqlwindow-functions

Select top rows until value in specific column has appeared twice


I have the following query where I am trying to select all records, ordered by date, until the second time EmailApproved = 1 is found. The second record where EmailApproved = 1 should not be selected.

declare @Test table (Id int, EmailApproved bit, Created datetime);

insert into @Test (Id, EmailApproved, Created)
values
  (1,0,'2011-03-07 03:58:58.423')
  , (2,0,'2011-02-21 04:55:52.103')
  , (3,0,'2011-01-29 13:24:02.103')
  , (4,1,'2010-10-12 14:41:54.217')
  , (5,0,'2010-10-12 14:34:15.903')
  , (6,0,'2010-10-12 10:10:19.123')
  , (7,1,'2010-08-27 12:07:16.073')
  , (8,1,'2010-08-25 12:15:49.413')
  , (9,0,'2010-08-25 12:14:51.970')
  , (10,1,'2010-04-12 16:43:44.777');

select *
  , case when Row1 = Row2 then 1 else 0 end Row1EqualRow2
from (
  select Id, EmailApproved, Created
    , row_number() over (partition by EmailApproved order by Created desc) Row1
    , row_number() over (order by Created desc) Row2
  from @Test
) X
--where Row1 = Row2
order by Created desc;

Which produces the following results:

Id EmailApproved Created Row1 Row2 Row1EqualsRow2
1 0 2011-03-07 03:58:58.423 1 1 1
2 0 2011-02-21 04:55:52.103 2 2 1
3 0 2011-01-29 13:24:02.103 3 3 1
4 1 2010-10-12 14:41:54.217 1 4 0
5 0 2010-10-12 14:34:15.903 4 5 0
6 0 2010-10-12 10:10:19.123 5 6 0
7 1 2010-08-27 12:07:16.073 2 7 0
8 1 2010-08-25 12:15:49.413 3 8 0
9 0 2010-08-25 12:14:51.970 6 9 0
10 1 2010-04-12 16:43:44.777 4 10 0

What I actually want is:

Id EmailApproved Created Row1 Row2 Row1EqualsRow2
1 0 2011-03-07 03:58:58.423 1 1 1
2 0 2011-02-21 04:55:52.103 2 2 1
3 0 2011-01-29 13:24:02.103 3 3 1
4 1 2010-10-12 14:41:54.217 1 4 0
5 0 2010-10-12 14:34:15.903 4 5 0
6 0 2010-10-12 10:10:19.123 5 6 0

Note: Row, Row2 & Row1EqualsRow2 are just working columns to show my calculations.


Solution

  • Steps:

    1. Create a row number, rn, over all rows in case id is not in sequence.
    2. Create a row number, approv_rn, partitioned by EmailApproved so we know when EmailApproved = 1 for the second time
    3. Use a outer apply to find the row number of the second instance of EmailApproved = 1
    4. In the where clause filter out all rows where the row number is >= the value found in step 3.
    5. If there is 1 or 0 EmailApproved records available then the outer apply will return null, in which case return all available rows.
    with test as
    (
        select  *, 
                rn         = row_number() over (order by Created desc),
                approv_rn  = row_number() over (partition by EmailApproved 
                                                    order by Created desc)
        from    @Test
    )
    select  *
    from    test t
            outer apply
            (
                select  x.rn
                from    test x
                where   x.EmailApproved = 1
                and     x.approv_rn     = 2
            ) x
    where   t.rn    < x.rn or x.rn is null
    order by t.Created desc;