T-SQL to create an ID column


I'm using SQL Server 2008 R2 and I have the following dataset:

+---------+--------------+--------------+----------+------------+------------+
| Dossier | refmouvement | refadmission | refunite |  datedeb   |  datefin   |
+---------+--------------+--------------+----------+------------+------------+
| P001234 |         2567 |         1234 |      227 | 2012-01-01 | 2012-01-02 |
| P001234 |         2568 |         1234 |      227 | 2012-01-02 | 2012-01-03 |
| P001234 |         2569 |         1234 |      224 | 2012-01-03 | 2012-01-06 |
| P001234 |         2570 |         1234 |      232 | 2012-01-06 | 2012-01-10 |
| P001234 |         2571 |         1234 |      232 | 2012-01-10 | 2012-01-15 |
| P001234 |         2572 |         1234 |      232 | 2012-01-15 | 2012-01-20 |
| P001234 |         2573 |         1234 |      232 | 2012-01-20 | 2012-01-25 |
| P001234 |         2574 |         1234 |      224 | 2012-01-25 | 2012-01-29 |
| P001234 |         2575 |         1234 |      227 | 2012-01-29 | 2012-02-05 |
| P001234 |         2576 |         1234 |      227 | 2012-02-05 | 2012-02-10 |
| P001234 |         2577 |         1234 |      232 | 2012-02-10 | 2012-02-15 |
| P001234 |         2578 |         1234 |      201 | 2012-02-15 | 2012-02-26 |
+---------+--------------+--------------+----------+------------+------------+

This dataset is ordered by datedeb (the start date).

As you can see, the dataset is contiguous: each row's datefin is equal to the next row's datedeb.

I need to create an ID column that gives consecutive rows (in datedeb order) with the same refunite the same ID, and moves to the next ID each time refunite changes, like this:

+----+---------+--------------+--------------+----------+------------+------------+
| ID | Dossier | refmouvement | refadmission | refunite |  datedeb   |  datefin   |
+----+---------+--------------+--------------+----------+------------+------------+
|  1 | P001234 |         2567 |         1234 |      227 | 2012-01-01 | 2012-01-02 |
|  1 | P001234 |         2568 |         1234 |      227 | 2012-01-02 | 2012-01-03 |
|  2 | P001234 |         2569 |         1234 |      224 | 2012-01-03 | 2012-01-06 |
|  3 | P001234 |         2570 |         1234 |      232 | 2012-01-06 | 2012-01-10 |
|  3 | P001234 |         2571 |         1234 |      232 | 2012-01-10 | 2012-01-15 |
|  3 | P001234 |         2572 |         1234 |      232 | 2012-01-15 | 2012-01-20 |
|  3 | P001234 |         2573 |         1234 |      232 | 2012-01-20 | 2012-01-25 |
|  4 | P001234 |         2574 |         1234 |      224 | 2012-01-25 | 2012-01-29 |
|  5 | P001234 |         2575 |         1234 |      227 | 2012-01-29 | 2012-02-05 |
|  5 | P001234 |         2576 |         1234 |      227 | 2012-02-05 | 2012-02-10 |
|  6 | P001234 |         2577 |         1234 |      232 | 2012-02-10 | 2012-02-15 |
|  7 | P001234 |         2578 |         1234 |      201 | 2012-02-15 | 2012-02-26 |
+----+---------+--------------+--------------+----------+------------+------------+

I can't work out which of RANK(), ROW_NUMBER() or DENSE_RANK(), or which combination of them, could achieve this. I have searched everywhere but cannot find anything; maybe I'm not using the proper keywords.

Any help will be appreciated.

Thanks.

Here's the code that I've tried so far:

SELECT
    ROW_NUMBER() OVER (ORDER BY t1.[datedeb]) AS [ID1]
   ,DENSE_RANK() OVER (PARTITION BY t1.[refunite] ORDER BY t1.[datedeb]) AS [ID2]
   ,t1.[Dossier]
   ,t1.[refmouvement]
   ,t1.[refadmission]
   ,t1.[refunite]
   ,t1.[datedeb]
   ,t1.[datefin]
   ,t2.[refmouvement] AS [prev_refmouvement]
   ,t2.[refunite] AS [prev_refunite]
FROM [sometable] t1
LEFT OUTER JOIN [sometable] t2  /* self join to the previous row */
     ON t2.[datefin] = t1.[datedeb]
        AND t1.[refadmission] = t2.[refadmission]
ORDER BY
    t1.[datedeb]

This is what it gives me:

+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+
| ID1 | ID2 | Dossier | refmouvement | refadmission | refunite |  datedeb   |  datefin   | prev_refmouvement | prev_refunite |
+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+
|   1 |   1 | P001234 |         2567 |         1234 |      227 | 2012-01-01 | 2012-01-02 | NULL              | NULL          |
|   2 |   2 | P001234 |         2568 |         1234 |      227 | 2012-01-02 | 2012-01-03 | 2567              | 227           |
|   3 |   1 | P001234 |         2569 |         1234 |      224 | 2012-01-03 | 2012-01-06 | 2568              | 227           |
|   4 |   1 | P001234 |         2570 |         1234 |      232 | 2012-01-06 | 2012-01-10 | 2569              | 224           |
|   5 |   2 | P001234 |         2571 |         1234 |      232 | 2012-01-10 | 2012-01-15 | 2570              | 232           |
|   6 |   3 | P001234 |         2572 |         1234 |      232 | 2012-01-15 | 2012-01-20 | 2571              | 232           |
|   7 |   4 | P001234 |         2573 |         1234 |      232 | 2012-01-20 | 2012-01-25 | 2572              | 232           |
|   8 |   2 | P001234 |         2574 |         1234 |      224 | 2012-01-25 | 2012-01-29 | 2573              | 232           |
|   9 |   3 | P001234 |         2575 |         1234 |      227 | 2012-01-29 | 2012-02-05 | 2574              | 224           |
|  10 |   4 | P001234 |         2576 |         1234 |      227 | 2012-02-05 | 2012-02-10 | 2575              | 227           |
|  11 |   5 | P001234 |         2577 |         1234 |      232 | 2012-02-10 | 2012-02-15 | 2576              | 227           |
|  12 |   1 | P001234 |         2578 |         1234 |      201 | 2012-02-15 | 2012-02-26 | 2577              | 232           |
+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+

Shaz


Solution

    DECLARE @Results TABLE(
        RowNum INT PRIMARY KEY,
        refunite INT NOT NULL,
        datedeb DATETIME NOT NULL
    );
    
    INSERT  @Results (RowNum, refunite, datedeb)
    SELECT  ROW_NUMBER() OVER(ORDER BY datedeb) AS RowNum,
            refunite, 
            datedeb
    FROM    dbo.MyTable;
    
    WITH CTERecursive
    AS (
        SELECT  crt.RowNum,
                crt.refunite,
                crt.datedeb,
                1 AS Rnk -- Starting rank
        FROM    @Results crt
        WHERE   crt.RowNum = 1
        UNION ALL
        SELECT  crt.RowNum,
                crt.refunite,
                crt.datedeb,
                CASE WHEN prev.refunite = crt.refunite THEN prev.Rnk ELSE prev.Rnk + 1 END
        FROM    @Results crt INNER JOIN CTERecursive prev ON crt.RowNum = prev.RowNum + 1
    )
    SELECT  *
    FROM    CTERecursive
    -- OPTION(MAXRECURSION 1000); -- Uncomment this line if the table has more rows than the default limit of 100 recursion levels allows (MAXRECURSION 0 removes the limit)
    

    Results:

    RowNum      refunite    datedeb                 Rnk
    ----------- ----------- ----------------------- ---
    1           227         2012-01-01 00:00:00.000 1
    2           227         2012-01-02 00:00:00.000 1
    3           224         2012-01-03 00:00:00.000 2
    4           232         2012-01-06 00:00:00.000 3
    5           232         2012-01-10 00:00:00.000 3
    6           232         2012-01-15 00:00:00.000 3
    7           232         2012-01-20 00:00:00.000 3
    8           224         2012-01-25 00:00:00.000 4
    9           227         2012-01-29 00:00:00.000 5
    10          227         2012-02-05 00:00:00.000 5
    11          232         2012-02-10 00:00:00.000 6
    12          201         2012-02-15 00:00:00.000 7
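
A non-recursive alternative is also possible on SQL Server 2008 R2, which has no LAG(). The following is only a sketch of the usual "gaps and islands" trick, not a replacement for the answer above: the difference between two ROW_NUMBER() values stays constant within a run of consecutive rows sharing the same refunite, and DENSE_RANK() over each run's first datedeb then numbers the runs in date order. The table variable @Mouvements and the names grp and island_start are hypothetical, and partitioning by refadmission (so that numbering restarts for each admission) is an assumption that the question does not state.

```sql
-- Sample data copied from the question; @Mouvements is a hypothetical name.
DECLARE @Mouvements TABLE (
    Dossier      VARCHAR(10) NOT NULL,
    refmouvement INT         NOT NULL PRIMARY KEY,
    refadmission INT         NOT NULL,
    refunite     INT         NOT NULL,
    datedeb      DATE        NOT NULL,
    datefin      DATE        NOT NULL
);

INSERT @Mouvements (Dossier, refmouvement, refadmission, refunite, datedeb, datefin)
VALUES ('P001234', 2567, 1234, 227, '2012-01-01', '2012-01-02'),
       ('P001234', 2568, 1234, 227, '2012-01-02', '2012-01-03'),
       ('P001234', 2569, 1234, 224, '2012-01-03', '2012-01-06'),
       ('P001234', 2570, 1234, 232, '2012-01-06', '2012-01-10'),
       ('P001234', 2571, 1234, 232, '2012-01-10', '2012-01-15'),
       ('P001234', 2572, 1234, 232, '2012-01-15', '2012-01-20'),
       ('P001234', 2573, 1234, 232, '2012-01-20', '2012-01-25'),
       ('P001234', 2574, 1234, 224, '2012-01-25', '2012-01-29'),
       ('P001234', 2575, 1234, 227, '2012-01-29', '2012-02-05'),
       ('P001234', 2576, 1234, 227, '2012-02-05', '2012-02-10'),
       ('P001234', 2577, 1234, 232, '2012-02-10', '2012-02-15'),
       ('P001234', 2578, 1234, 201, '2012-02-15', '2012-02-26');

WITH Numbered AS (
    -- grp is constant within each run of consecutive rows with the same refunite.
    -- Assumption: runs are evaluated separately for each refadmission.
    SELECT  m.*,
            ROW_NUMBER() OVER (PARTITION BY m.refadmission ORDER BY m.datedeb)
          - ROW_NUMBER() OVER (PARTITION BY m.refadmission, m.refunite ORDER BY m.datedeb) AS grp
    FROM    @Mouvements m
),
Islands AS (
    -- Tag every row with the first datedeb of the run it belongs to.
    SELECT  n.*,
            MIN(n.datedeb) OVER (PARTITION BY n.refadmission, n.refunite, n.grp) AS island_start
    FROM    Numbered n
)
SELECT  DENSE_RANK() OVER (PARTITION BY i.refadmission ORDER BY i.island_start) AS ID,
        i.Dossier,
        i.refmouvement,
        i.refadmission,
        i.refunite,
        i.datedeb,
        i.datefin
FROM    Islands i
ORDER BY i.refadmission, i.datedeb;
```

On the twelve sample rows this yields the ID sequence 1, 1, 2, 3, 3, 3, 3, 4, 5, 5, 6, 7, matching the table requested in the question. It relies on datedeb being unique within an admission, which holds for the sample data. Because it is set-based, it is not subject to the MAXRECURSION limit.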