Search code examples
sqlamazon-web-servicessubqueryprestoamazon-athena

Athena/SQL query to get the desired result


sample_input_table

user  name  action      date
 1    aaa    view      2020-09-03
 2    bbb    view      2020-09-02
 3    ccc    view      2020-08-28
 4    ddd    view      2020-08-25
 1    aaa    purchase  2020-09-09

I have a table with huge number of rows, the table looks like above.

question

  1. i want to print the rows which have purchase action and
  2. at the same time, the user who did purchase must have row with view action
  3. and at the same time, that view action will be in the date range of purchase_date(2020-09-09) and purchase_date - 7days(2020-09-02).

I want to achieve these 3 point in one sql query

sample_output

user  name  action      date
1    aaa    purchase  2020-09-09

if we see sample output from the sample input

  1. our end result have only purchase_events
  2. purchased_user had a row with view action
  3. and that view was there in the timeframe of 2020-09-09 and 2020-09-02(purchased_date, purchased_date - 7 days)

Can anyone suggest some solution for this?


Solution

  • You can use exists:

    select t.*
    from mytable t
    where t.action = 'purchase' and exists (
        select 1
        from mytable t1
        where 
            t1.user = t.user 
            and t1.action = 'view'
            and t1.date >= t.date - interval '7' day
            and t1.date < t.date
        )