Search code examples
sqlsql-serversql-server-2012row-numberdatabase-partitioning

Fill NULL value with progressive row_number over partition function


What I have

From the following #MyTable I just have Name and Number columns.

My goal is to fill the valus where Number = NULL with a progressive number and get the values I have wrote into the Desidered_col column.

+------+--------+---------------+
| Name | Number | Desidered_col |
+------+--------+---------------+
| John | 1      |             1 |
| John | 2      |             2 |
| John | 3      |             3 |
| John | NULL   |             4 |
| John | NULL   |             5 |
| John | 6      |             6 |
| Mike | 1      |             1 |
| Mike | 2      |             2 |
| Mike | NULL   |             3 |
| Mike | 4      |             4 |
| Mike | 5      |             5 |
| Mike | 6      |             6 |
+------+--------+---------------+

What I have tried

I have tried with the following query

SELECT Name, Number, row_number() OVER(PARTITION BY [Name] ORDER BY Number ASC) AS rn
FROM #MyTable

but it put all the NULL values first and then count the rows. How can I fill the empty values?

Why I don't think is a duplicate question

I have read this question and this question but I don't think it is duplicate because they don't consider the PARTITION BY construct.


This is the script to create and populate the table

SELECT * 
INTO #MyTable
FROM (
    SELECT 'John' AS [Name], 1 AS [Number], 1 AS [Desidered_col] UNION ALL
    SELECT 'John' AS [Name], 2 AS [Number], 2 AS [Desidered_col] UNION ALL
    SELECT 'John' AS [Name], 3 AS [Number], 3 AS [Desidered_col] UNION ALL
    SELECT 'John' AS [Name], NULL AS [Number], 4 AS [Desidered_col] UNION ALL
    SELECT 'John' AS [Name], NULL AS [Number], 5 AS [Desidered_col] UNION ALL
    SELECT 'John' AS [Name], 6 AS [Number], 6 AS [Desidered_col] UNION ALL
    SELECT 'Mike' AS [Name], 1 AS [Number], 1 AS [Desidered_col] UNION ALL
    SELECT 'Mike' AS [Name], 2 AS [Number], 2 AS [Desidered_col] UNION ALL
    SELECT 'Mike' AS [Name], NULL AS [Number], 3 AS [Desidered_col] UNION ALL
    SELECT 'Mike' AS [Name], 4 AS [Number], 4 AS [Desidered_col] UNION ALL
    SELECT 'Mike' AS [Name], 5 AS [Number], 5 AS [Desidered_col] UNION ALL
    SELECT 'Mike' AS [Name], 6 AS [Number], 6 AS [Desidered_col]
) A

Solution

  • This query is a bit complicated but seems to return your expected result. The only case it may be wrong is when someone does not have Number = 1.

    The idea is that you must find gaps between numbers and count how many null values can be used to fill them.

    Sample data

    create table #myTable (
        [Name] varchar(20)
        , [Number] int
    )
    
    insert into #myTable
    insert into #myTable
    SELECT 'John' AS [Name], 1 AS [Number] UNION ALL
    SELECT 'John' AS [Name], 2 AS [Number]UNION ALL
    SELECT 'John' AS [Name], 3 AS [Number] UNION ALL
    SELECT 'John' AS [Name], NULL AS [Number] UNION ALL
    SELECT 'John' AS [Name], NULL AS [Number] UNION ALL
    SELECT 'John' AS [Name], 6 AS [Number] UNION ALL
    SELECT 'Mike' AS [Name], 1 AS [Number] UNION ALL
    SELECT 'Mike' AS [Name], 2 AS [Number] UNION ALL
    SELECT 'Mike' AS [Name], NULL AS [Number] UNION ALL
    SELECT 'Mike' AS [Name], 4 AS [Number] UNION ALL
    SELECT 'Mike' AS [Name], 5 AS [Number] UNION ALL
    SELECT 'Mike' AS [Name], 6 AS [Number]
    

    Query

    ;with gaps_between_numbers as (
        select
            t.Name, cnt = t.nextNum - t.Number - 1, dr = dense_rank() over (partition by t.Name order by t.Number)
            , rn = row_number() over (partition by t.Name order by t.Number)
        from (
            select 
                Name, Number, nextNum = isnull(lead(Number) over (partition by Name order by number), Number + 1)
            from 
                #myTable
            where
                Number is not null
        ) t
        join master.dbo.spt_values v on t.nextNum - t.Number - 1 > v.number
        where
            t.nextNum - t.Number > 1
            and v.type = 'P'
    )
    , ordering_nulls as (
        select
            t.Name, dr = isnull(q.dr, 2147483647)
        from (
            select
                Name, rn = row_number() over (partition by Name order by (select 1))
            from
                #myTable
            where 
                Number is null
        ) t
        left join gaps_between_numbers q on t.Name = q.Name and t.rn = q.rn
    )
    , ordering_not_null_numbers as (
        select
            Name, Number, rn = dense_rank() over (partition by Name order by gr)
        from (
            select
                Name, Number, gr = sum(lg) over (partition by Name order by Number)
            from (
                select
                    Name, Number, lg = iif(Number - lag(Number) over (partition by Name order by Number) = 1, 0, 1)
                from
                    #myTable
                where
                    Number is not null
            ) t
        ) t
    )
    
    select
        Name, Number
        , Desidered_col = row_number() over (partition by Name order by rn, isnull(Number, 2147483647))
    from (
        select * from ordering_not_null_numbers
        union all
        select Name, null, dr from ordering_nulls   
    ) t
    

    CTE gaps_between_numbers is seeking for numbers that are not consecutive. Number difference between current and next row shows how many NULL values can be used to fill the gaps. Then master.dbo.spt_values is used to multiply each row by that amount. In gaps_between_numbers dr column is gap number and cnt is amount of NULL values that need to used.

    ordering_nulls orders only NULL values and is joined with CTE gaps_between_numbersto know in which position each row should appear.

    ordering_not_null_numbers orders values that are not NULL. Consecutive Numbers will have same row number

    And last step is to union CTE's ordering_not_null_numbers and ordering_nulls and make desired ordering

    Rextester DEMO