Search code examples
sqlsql-serverwindow-functionsrow-number

Dealing with NULLS in ROW_NUMBER function


I have a task where I have to give the rank to NULLS with respect to it's other not null duplicate values.

I have a table named TABLEA with column as ID, DATE and Amount . There is some rules that filters few rows from TABLEA with column as ID1, DATE1 and Amount1 and final result after filter is inserted into TABLEB. Now I did Table A left Join Table B which gives me the result as given below .

  ID      Date        AMOUNT    ID1      Date1        AMOUNT1    
  ---------------------------------------------------------------
  1       2022-10-17    400       NULL     NULL         NULL     
  1       2022-10-18    300       1       2022-10-18    300       
  2       2022-10-19    432       2       2022-10-19    432       
  2       2022-10-20    100       NULL    NULL          NULL 
  2       2022-10-21    200       NULL    NULL          NULL        
  2       2022-10-22    234       2       2022-10-21    234        
  3       2022-10-22    213       3       2022-10-22    213        
  3       2022-10-22    112       3       2022-10-23    112       
  3       2022-10-22    218       3       NULL          NULL       
  4       2022-10-23    654       4       2022-10-18    654         
  5       2022-10-28    500       NULL    NULL          NULL        
  

I want the rank as displayed in Row Number column of below example. I am doing group by ID and order by Date.

  ID      Date        AMOUNT    ID1      Date1        AMOUNT1    Row Number
  ----------------------------------------------------------------------------
  1       2022-10-17    400       NULL     NULL         NULL        1
  1       2022-10-18    300       1       2022-10-18    300         1
  2       2022-10-19    432       2       2022-10-19    432         1
  2       2022-10-20    100       NULL    NULL          NULL        1
  2       2022-10-21    200       NULL    NULL          NULL        1
  2       2022-10-22    234       2       2022-10-21    234         2
  3       2022-10-22    213       3       2022-10-22    213         1
  3       2022-10-22    112       3       2022-10-23    112         2
  3       2022-10-22    218       3       NULL          NULL        2
  4       2022-10-23    654       4       2022-10-18    654         1
  5       2022-10-28    500       NULL    NULL          NULL        1

I tried using row_number/dense_rank/rank function but could not find the solution.

SQL Query tried :

Select a.ID , a.date , a.amount , b.id1, b.date1, b.amount1 from TableA a
, row_number() over (partition by b.id1 order by case when b.date1 is not null then b.date1 end) ROW Number 
left join TableB b
on a.id = b.id 

Logic to achieve :

For ID = 1 There are two rows for ID =1 , but ID1 , Date1 and Amount1 column is null for ID=1. So I need to populate Row Number = 1 for both the rows for ID= 1. For this, I did partition by ID and order by Date to achieve Row Number = 1 for both rows. [which is not yet done by my Sql query and I stuck to achieve this logic.]

For ID = 2 There are four rows so when I do partition by ID and Order by Date , I need to get Row Number = 1 for 1st row, Row Number = 1 for 2nd and 3rd row as ID1, Date1 and Amount1 is null in these two rows and Row Number = 2 for 4th row as there is no null values in ID1, Date1 and Amount1 column.

For ID = 3 There are three rows. so when I do partition by ID and Order by Date , I need to get Row Number = 1 for 1st row, Row Number = 2 for 2nd and 3rd row as ID1, Date1 and Amount1 is null for 3rd row.

For ID =4, it is simple, it will have row number =1 as it is non duplicate row.

For ID =5 , It will also have row number = 1 as there is the only row for this ID.

Also If for a ID, if two rows have same date and not null ID1, Date1, Amount1 column then they should have different Row Number.

DDL and DML for above example.

Create table Test
(
ID int,
Date date,
Amount int,
ID1 int,
Date1 date,
Amount1 int
)


Insert into Test (ID, Date, Amount, ID1, Date1, Amount1) values 
(1 ,'2022-10-17',400,NULL,NULL ,NULL ) ,    
(1 ,'2022-10-18',300, 1 ,'2022-10-18', 300 ),       
(2 ,'2022-10-19',432, 2 ,'2022-10-19', 432 ),  
(2 ,'2022-10-20',100,NULL, NULL        ,NULL), 
(2 ,'2022-10-21',200,NULL, NULL        ,NULL),        
(2 ,'2022-10-22',234,2   , '2022-10-21',234 ),       
(3 ,'2022-10-22',213,3   , '2022-10-22',213 ),       
(3 ,'2022-10-22',112,3   , '2022-10-23',112 ),      
(3 ,'2022-10-22',218,3   , NULL        ,NULL),       
(4 ,'2022-10-23',654,4   , '2022-10-18',654 ),        
(5 ,'2022-10-28',500,NULL, NULL        ,NULL)    

Solution

  • I think you can use a windowed SUM with a CASE expression and a COALESCE to accomplish this. It returns the desired results as shown.

    select *
        , coalesce(sum(case when Amount1 is not null then 1 end) over (partition by id order by [Date]),1)
    from Test
    order by id, [Date];