Below is a small dataset of transaction records, with ID, DATE of the month, dummy variable of Bad_Credit or not. I would like to pull out all the transactions after a bad credit start. The OUTPUT column indicate the correct result, which is row 1,2,3,5,6,8,10.
This is just an example, there could be thousands of rows. SQL, R, SPSS will all work. Thank you.
DATE | ID | Bad_CREDIT | OUTPUT |
---|---|---|---|
12 | A | 1 | 1 |
15 | A | 1 | 1 |
18 | A | 0 | 1 |
2 | B | 0 | 0 |
10 | B | 1 | 1 |
20 | B | 0 | 1 |
5 | C | 0 | 0 |
15 | C | 1 | 1 |
1 | D | 0 | 0 |
9 | E | 1 | 1 |
If I understand correctly, you can use window functions:
select t.*
from (select t.*,
min(case when bad_credit = 1 then date end) over (partition by id) as min_bd_date
from t
) t
where date >= min_bd_date;
You can also do this with a correlated subquery:
select t.*
from t
where t.date >= (select min(t2.date)
from t t2
where t2.id = t.id and
t2.bad_credit = 1
);