Search code examples
t-sqlrow-numberdata-partitioning

OVER PARTITION BY without ordering


I have a table from which I would like to retrieve only the most recent record within each group of records, as marked by the value in a particular field.

The table content looks something like this:

state   date
512     2021-03-09 11:31:38.300
512     2021-03-09 11:31:38.300
512     2021-03-09 11:31:31.693
512     2021-03-09 11:31:31.693
512     2021-03-08 12:49:10.753
512     2021-03-08 12:35:47.357
514     2021-03-08 12:35:01.030
512     2021-03-08 12:33:48.050
514     2021-03-08 12:14:29.537
514     2021-03-08 12:14:29.537
514     2021-03-08 12:14:18.760
512     2021-03-08 12:14:05.597

I would like to use OVER and PARTITION to SELECT output like this:

row state   date
1   512     2021-03-09 11:31:38.300
2   512     2021-03-09 11:31:38.300
3   512     2021-03-09 11:31:31.693
4   512     2021-03-09 11:31:31.693
5   512     2021-03-08 12:49:10.753
6   512     2021-03-08 12:35:47.357
1   514     2021-03-08 12:35:01.030
1   512     2021-03-08 12:33:48.050
1   514     2021-03-08 12:14:29.537
2   514     2021-03-08 12:14:29.537
3   514     2021-03-08 12:14:18.760
1   512     2021-03-08 12:14:05.597

As you can see, the rows are ordered by date DESC and the state field is grouped by virtue of the row field starting at 1 for each change in the state field.

Currently, my code looks like this:

with query as
(
    select state, date, row = row_number() over (partition by state order by date desc)
    from table
)
select t.*
from table t
inner join query q on t.state = q.state and t.date = q.date
where row = 1
order by t.date desc

Unfortunately, this appears to group the records by state before ordering them by date DESC, so the result is only two result set records because there are only two different values in the state field. There should (for the example data above) be 5 resultset records.

How can I number the partition groups properly?


Solution

  • This works, though there may be a simpler way to do it.

    WITH
    cte1 AS
    (
        SELECT 
            ROW_NUMBER() OVER(ORDER BY date_ DESC) as dateRow,
            ROW_NUMBER() OVER(PARTITION BY state ORDER BY date_ DESC) as stateRow,
            state,
            date_
        FROM    StateDate
    )
    SELECT 
        ROW_NUMBER() OVER(PARTITION BY state, (dateRow - stateRow)  ORDER BY date_ DESC) as row,
        state, date_
    FROM cte1
    ORDER BY date_ DESC, state
    

    Here's the data setup I used:

    CREATE TABLE StateDate( state INT, date_ DATETIME)
    GO
    
    --state   date
    INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:38.300');
    INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:38.300');
    INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:31.693');
    INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:31.693');
    INSERT INTO StateDate VALUES (512, '2021-03-08 12:49:10.753');
    INSERT INTO StateDate VALUES (512, '2021-03-08 12:35:47.357');
    INSERT INTO StateDate VALUES (514, '2021-03-08 12:35:01.030');
    INSERT INTO StateDate VALUES (512, '2021-03-08 12:33:48.050');
    INSERT INTO StateDate VALUES (514, '2021-03-08 12:14:29.537');
    INSERT INTO StateDate VALUES (514, '2021-03-08 12:14:29.537');
    INSERT INTO StateDate VALUES (514, '2021-03-08 12:14:18.760');
    INSERT INTO StateDate VALUES (512, '2021-03-08 12:14:05.597');
    GO