Take this table as an example:
+----+------------+------+
| id | date | flag |
+----+------------+------+
| A | 01/01/2020 | 0 |
| A | 01/02/2020 | 0 |
| A | 01/03/2020 | 0 |
| A | 01/04/2020 | 1 |
| A | 01/05/2020 | 1 |
| B | 01/01/2020 | 0 |
| B | 01/02/2020 | 1 |
| B | 01/03/2020 | 1 |
| B | 01/04/2020 | 1 |
| B | 01/05/2020 | 1 |
+----+------------+------+
There is some flag that is either set to 0 or 1. I want to create a new column called day_flagged
that will contain the date that the flag first became a 1. For example, for id
A, that would be 01/04/2020
. For id
B, that would be 01/02/2020
.
This is what I currently have:
SELECT x.id,
x.date,
( CASE
WHEN prev_flag = 0
AND next_flag = 1
AND x.flag = 1 THEN 1
ELSE NULL
END ) AS flagged
FROM (SELECT id,
date,
flag,
Lag(flag)
OVER (
partition BY id
ORDER BY date ASC) AS prev_flag,
Lead(flag)
OVER (
partition BY id
ORDER BY date ASC) AS next_flag
FROM tableA) AS x;
The result of that is this:
+----+------------+---------+
| id | date | flagged |
+----+------------+---------+
| A | 01/01/2020 | null |
| A | 01/02/2020 | null |
| A | 01/03/2020 | null |
| A | 01/04/2020 | 1 |
| A | 01/05/2020 | null |
| B | 01/01/2020 | null |
| B | 01/02/2020 | 1 |
| B | 01/03/2020 | null |
| B | 01/04/2020 | null |
| B | 01/05/2020 | null |
+----+------------+---------+
I'm able to identify when the value of flag
for each id
first changed from 0 to 1 and store that in flagged
. How can I take the date
value corresponding to the rows where flagged
is 1, and insert that date into every row of the partition as day_flagged
?
Desired outcome:
+----+------------+------+-------------+
| id | date | flag | day_flagged |
+----+------------+------+-------------+
| A | 01/01/2020 | 0 | 01/04/2020 |
| A | 01/02/2020 | 0 | 01/04/2020 |
| A | 01/03/2020 | 0 | 01/04/2020 |
| A | 01/04/2020 | 1 | 01/04/2020 |
| A | 01/05/2020 | 1 | 01/04/2020 |
| B | 01/01/2020 | 0 | 01/02/2020 |
| B | 01/02/2020 | 1 | 01/02/2020 |
| B | 01/03/2020 | 1 | 01/02/2020 |
| B | 01/04/2020 | 1 | 01/02/2020 |
| B | 01/05/2020 | 1 | 01/02/2020 |
+----+------------+------+-------------+
DB Fiddle: https://www.db-fiddle.com/f/wJsTnvNkYELHqLjHRx1pie/4
I understand that you want the date of the first 1
for each id
.
If so, it seems like a conditional window min()
would fit your need:
select
t.*,
min(date) filter(where flag = 1) over(partition by id) day_flagged
from tableA t
| id | date | flag | day_flagged |
| --- | ---------- | ---- | ----------- |
| A | 01/01/2020 | 0 | 01/04/2020 |
| A | 01/02/2020 | 0 | 01/04/2020 |
| A | 01/03/2020 | 0 | 01/04/2020 |
| A | 01/04/2020 | 1 | 01/04/2020 |
| A | 01/05/2020 | 1 | 01/04/2020 |
| B | 01/01/2020 | 0 | 01/02/2020 |
| B | 01/02/2020 | 1 | 01/02/2020 |
| B | 01/03/2020 | 1 | 01/02/2020 |
| B | 01/04/2020 | 1 | 01/02/2020 |
| B | 01/05/2020 | 1 | 01/02/2020 |