ROW YEAR PROD KEY DATE
1 2011 APPLE TIME 2011-11-18 00:00:00.000
2 2011 APPLE TIME 2011-11-19 00:00:00.000
3 2013 APPLE NULL 2011-11-18 00:00:00.000
4 2013 APPLE NULL 2011-11-19 00:00:00.000
5 2013 APPLE TIME 2014-04-08 00:00:00.000
6 2013 APPLE DIM 2014-04-09 00:00:00.000
7 2013 APPLE TIME 2014-11-10 10:50:14.113
8 2013 APPLE TIME 2014-11-12 10:46:04.947
9 2013 MELON JAK 2011-10-17 11:01:19.657
10 2013 MELON TIME 2014-11-18 11:19:35.547
11 2013 MELON NULL 2014-11-19 11:19:35.547
12 2013 MELON TIME 2014-11-21 10:32:36.017
13 2014 APPLE JAK 2003-04-10 00:00:00.000
14 2014 APPLE DIM 2003-04-11 00:00:00.000
15 2015 APPLE TIME 2002-09-27 00:00:00.000
16 2015 APPLE NULL 2004-09-28 00:00:00.000
ROW is not a column in the table. Is just to show which records i want.
The above data is partitionned by (YEAR, PROD) and ordered by DATE.
I need to keep all the rows except row 3 and 4 based on the following logic :
Each group has to start with records that have a KEY that is are not null
==> otherwise discard
In other words, i can have : not null, null, not null, null
But i cannot have : null, not null, null, not null
ROW YEAR PROD KEY DATE
1 2011 APPLE TIME 2011-11-18 00:00:00.000
2 2011 APPLE TIME 2011-11-19 00:00:00.000
5 2013 APPLE TIME 2014-04-08 00:00:00.000
6 2013 APPLE DIM 2014-04-09 00:00:00.000
7 2013 APPLE TIME 2014-11-10 10:50:14.113
8 2013 APPLE TIME 2014-11-12 10:46:04.947
9 2013 MELON JAK 2011-10-17 11:01:19.657
10 2013 MELON TIME 2014-11-18 11:19:35.547
11 2013 MELON TIME 2014-11-19 11:19:35.547
12 2013 MELON TIME 2014-11-21 10:32:36.017
13 2014 APPLE JAK 2003-04-10 00:00:00.000
14 2014 APPLE DIM 2003-04-11 00:00:00.000
15 2015 APPLE TIME 2002-09-27 00:00:00.000
16 2015 APPLE TIME 2004-09-28 00:00:00.000
I want to do that, so later i have always a non null key at the begginning of each group. In that way, i can later always use the former row to fill a subsequent records which have null value (in this example 11 and 16)
Any observation or suggestion would be much appreciated !
The following gets the output you desire. I am checking of the value of key column between rows unbounded preceeding and current row, and since NULL has the highest rank, if there are preceeding rows that are not null it would populate the field min_val with a NOT NULL column.
select * from (
select year,prod,key1,date1
,min(key1) over(partition by year,prod order by date1 asc) as min_val
from t
)x
where x.min_val is not null
+------+-------+------+-------------------------+---------+
| year | prod | key1 | date1 | min_val |
+------+-------+------+-------------------------+---------+
| 2011 | APPLE | TIME | 2011-11-18 00:00:00.000 | TIME |
| 2011 | APPLE | TIME | 2011-11-19 00:00:00.000 | TIME |
| 2013 | APPLE | TIME | 2014-04-08 00:00:00.000 | TIME |
| 2013 | APPLE | DIM | 2014-04-09 00:00:00.000 | DIM |
| 2013 | APPLE | TIME | 2014-11-10 10:50:14.113 | DIM |
| 2013 | APPLE | TIME | 2014-11-12 10:46:04.947 | DIM |
| 2013 | MELON | JAK | 2011-10-17 11:01:19.657 | JAK |
| 2013 | MELON | TIME | 2014-11-18 11:19:35.547 | JAK |
| 2013 | MELON | | 2014-11-19 11:19:35.547 | JAK |
| 2013 | MELON | TIME | 2014-11-21 10:32:36.017 | JAK |
| 2014 | APPLE | JAK | 2003-04-10 00:00:00.000 | JAK |
| 2014 | APPLE | DIM | 2003-04-11 00:00:00.000 | DIM |
| 2015 | APPLE | TIME | 2002-09-27 00:00:00.000 | TIME |
| 2015 | APPLE | | 2004-09-28 00:00:00.000 | TIME |
+------+-------+------+-------------------------+---------+
link https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=ae82f64802674aa60005b8e9f534a150