Search code examples
sql-serverrow-numberdata-partitioning

How to partition to spread values?


I have a table with data:

Customers

  • Sequence
  • ID
  • many other columns (not important)

Sample data:

Sequence ID
-----------
214906 2613
214906 2614
214906 2615
214907 2613
214907 2614
214907 2615
214908 2613
214908 2614
214908 2615
214000 2613
213004 4444
111111 5555
111111 5556
111112 5556 
111112 5555

How can I get the desired result below?

214906 2613
214907 2614
214908 2615
214000 2613
213004 4444
111111 5555
111112 5556

I tried various stuff with ROW_NUMBER() OVER(PARTITION BY Sequence) but it did not help because I need to take row 1 in first group, row 2 in second group etc. In other words, I need to somehow spread those Sequences across ID's. I cannot partition by ID's either because they might appear more than once in the table


Solution

  • I hope I understand you correctly. I use the count of IDs per sequence as a group factor (using SUM() with OVER clause without ORDER BY) and after that appropriate ranking and row numbering:

    Input:

    CREATE TABLE #Data (
        Sequence int,
        ID int
    )
    INSERT INTO #Data 
        (Sequence, ID)
    VALUES
        (214906, 2613),
        (214906, 2614),
        (214906, 2615),
        (214907, 2613),
        (214907, 2614),
        (214907, 2615),
        (214908, 2613),
        (214908, 2614),
        (214908, 2615),
        (214000, 2613),
        (213004, 4444),
        (111111, 5555),
        (111111, 5556),
        (111112, 5556), 
        (111112, 5555)
    

    T-SQL:

    ;WITH SequenceCTE AS (
        SELECT 
            *,
            COUNT(*) OVER (PARTITION BY Sequence) AS SequenceCnt
        FROM #Data
    ), RankCTE AS (
        SELECT 
            *,
            DENSE_RANK() OVER (PARTITION BY SequenceCnt, Sequence ORDER BY SequenceCnt, ID) AS RankNo,
            ROW_NUMBER() OVER (PARTITION BY SequenceCnt, ID ORDER BY Sequence, ID) AS RowNo
        FROM SequenceCTE 
    )
    SELECT Sequence, ID
    FROM RankCTE
    WHERE RankNo = RowNo
    

    Output:

    ----------------
    Sequence    ID
    ----------------
    214000      2613
    213004      4444
    111111      5555
    111112      5556
    214906      2613
    214907      2614
    214908      2615
    

    Update (special case with one ID in a sequence):

    ;WITH SequenceCTE AS (
        SELECT 
            *,
            COUNT(*) OVER (PARTITION BY Sequence) AS SequenceCnt
        FROM #Data
    ), RankCTE AS (
        SELECT 
            *,
            CASE 
                WHEN SequenceCnt = 1 THEN 1
                ELSE DENSE_RANK() OVER (PARTITION BY SequenceCnt, Sequence ORDER BY SequenceCnt, ID) 
            END AS RankNo,
            CASE 
                WHEN SequenceCnt = 1 THEN 1
                ELSE ROW_NUMBER() OVER (PARTITION BY SequenceCnt, ID ORDER BY Sequence, ID) 
            END AS RowNo
        FROM SequenceCTE 
    )
    SELECT Sequence, ID
    FROM RankCTE
    WHERE RankNo = RowNo