Search code examples
sqloracle-databasebucketrownum

Oracle SQL function or buckets for data filtering


SELECT 
     transaction
    ,date
    ,mail
    ,status
    ,ROW_NUMBER() OVER (PARTITION BY mail ORDER BY date) AS rownum
FROM table1

enter image description here

Having the above table and script I want to be able to filter the transactions on the basis of having first 3 rowids with status 'failed' to show rowid 4 if 'failed', having transactions with rowid 4,5,6 failed - show 7 if also failed etc. I was thinking about adding it to a pandas dataframe where to run a simple lambda function , but would really like to find a solution in SQL only.


Solution

  • You could use lead() and lag() to explicitly check:

    select t.*
    from (select t1.*,
                 lag(status, 3) over (partition by mail order by date) as status_3,
                 lag(status, 3) over (partition by mail order by date) as status_2,
                 lag(status, 3) over (partition by mail order by date) as status_1,
                 lead(status, 1) over (partition by mail order by date) as status_3n,
                 lead(status, 2) over (partition by mail order by date) as status_2n,
                 lead(status, 3) over (partition by mail order by date) as status_3n
          from t
         ) t
    where status = 'FAILED' and
          ( (status_3 = 'FAILED' and status_2 = 'FAILED' and status_1 = 'FAILED') or
            (status_2 = 'FAILED' and status_1 = 'FAILED' and status_1n = 'FAILED') or
            (status_1 = 'FAILED' and status_1n = 'FAILED' and status_2n = 'FAILED') or
            (status_1n = 'FAILED' and status_2n = 'FAILED and status_3n = 'FAILED')
          )
    

    This is a bit brute force, but I think the logic is quite clear.

    You could simplify the logic to:

    where regexp_like(status_3 || status_2 || status_1 || status || status_1n || status_2n || status3n,
                      'FAILED{4}'
                     )