Search code examples
t-sqlsequencegaps-in-data

Order rows in a sequence and fill gaps for missing rows


I got a problem regarding missing rows in a table that is giving me a headache.

As base data, I have the following table:

declare @table table
(
  id1   int,
  id2   int,
  ch    char(1) not null,
  val   int     
)

insert into @table values (1112, 121, 'A', 12)
insert into @table values (1351, 121, 'A', 13)
insert into @table values (1411, 121, 'B', 81)
insert into @table values (1312, 7, 'C', 107)
insert into @table values (1401, 2, 'A', 107)
insert into @table values (1454, 2, 'D', 107)
insert into @table values (1257, 6, 'A', 1)
insert into @table values (1269, 6, 'B', 12)
insert into @table values (1335, 6, 'C', 12)
insert into @table values (1341, 6, 'D', 5)
insert into @table values (1380, 6, 'A', 3)

The output should be ordered by id2 and follow a fixed sequence of ch, which should repeat until next id2 begins.

Sequence:

'A'
'B'
'C'
'D'

If the sequence or the pattern is interrupted, it should fill the missing rows with null, so that i get this result table:

id1     id2     ch     val
----------------------------    
1112    121     'A'    12
NULL    121     'B'    NULL
NULL    121     'C'    NULL
NULL    121     'D'    NULL
1351    121     'A'    13
1411    121     'B'    81
NULL    121     'C'    NULL
NULL    121     'D'    NULL
NULL    7       'A'    NULL
NULL    7       'B'    NULL
1312    7       'C'    107
NULL    7       'D'    NULL
1401    2       'A'    107
NULL    2       'B'    NULL
NULL    2       'C'    NULL
1454    2       'D'    107

and so on...

What I'm looking for is a way to do this without iterations.

I hope someone can help!

Thanks in advance!


Solution

  • A solution might be this:

    declare @table table (  id1   int,  id2   int,  ch    char(1) not null,  val   int     )
    insert into @table values (1112, 121, 'A', 12)
        ,(1351, 121, 'A', 13),(1411, 121, 'B', 81),(1312, 7, 'C', 107),(1401, 2, 'A', 107)
        ,(1454, 2, 'D', 107),(1257, 6, 'A', 1),(1269, 6, 'B', 12),(1335, 6, 'C', 12)
        ,(1341, 6, 'D', 5),(1380, 6, 'A', 3)
    ;with foo as
    (select 
        *
        ,row_number() over (partition by id2 order by id1) rwn
        ,ascii(isnull(lag(ch,1) over (partition by id2 order by id1),'A'))-ascii('A')  prev
        ,count(*) over (partition by id2,ch) nr
        ,ascii(ch)-ascii('A') cur
    from @table
    
    )
    ,bar as
    (
    select 
    *,case when cur<=prev and rwn>1 then 4 else 0 end + cur-prev step
    from foo
    )
    ,foobar as
    (
    select *,sum(step) over (partition by id2 order by id1 rows unbounded preceding) rownum
    from bar
    )
    ,iterations as
    (
    select id2,max(nr) nr from foo
    group by id2
    )
    ,blanks as
    (
    select
    id2,ch chnr,char(ch+ascii('A') )ch,ROW_NUMBER() over (partition by id2 order by c.nr,ch)-1 rownum,c.nr
    from iterations a
    inner join (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) c(nr)
    on c.nr<=a.nr
    cross join (values (0),(1),(2),(3)) b(ch)
    )
    select
    b.id1,a.id2,a.ch,b.val
    from blanks a
    left join foobar b
    on a.id2=b.id2 and a.rownum=b.rownum 
    order by a.id2,a.rownum
    

    I first make the query "foo" which looks at the row number and gets the previous value for ch for each id2.

    "bar" then finds how many missing values there are between the rows. For instance If the previous was an A and the current is a c then there are 2. If the previous was an A and the current is an A, then there are 4!

    "foobar" then adds the steps, thus numbering the original rows, where they should be in the final output.

    "iterations" counts the number of times the "ABCD" rows should appear.

    "BLANKS" then is all the final rows, that is for each id2, it outputs all the "ABCD" rows that should be in the final output, and numbers them in rownum

    Finally I left join "foobar" with "BLANKS" on id2 and rownum. Thus we get the correct number of rows, and the places where there are values in the original is output.