Search code examples
sqldatabase-performancenetezzacorrelated-subquery

What is the most performant way to rewrite a correlated subquery in the SELECT clause?


I am trying to count whether a user has visited a site in three time ranges:

  • last 30 days
  • between 31 and 60 days
  • between 61 and 90 days

I am using Netezza, which does NOT support correlated subqueries in the SELECT clause. See Rextester for successful query that must be re-written to NOT use a correlated subquery: http://rextester.com/JGR62033

Sample Data:

| user_id | last_visit | num_days_since_2017117 |
|---------|------------|------------------------|
| 1234    | 2017-11-02 | 15.6                   |
| 1234    | 2017-09-30 | 48.6                   |
| 1234    | 2017-09-03 | 75.0                   |
| 1234    | 2017-08-21 | 88.0                   |
| 9876    | 2017-10-03 | 45.0                   |
| 9876    | 2017-07-20 | 120.0                  |
| 5545    | 2017-09-15 | 63.0                   |

Desired Output:

| user_id | last_30 | btwn_31_60 | btwn_61_90 |
|---------|---------|------------|------------|
| 1234    | 1       | 1          | 1          |
| 5545    | 0       | 0          | 1          |
| 9876    | 0       | 1          | 0          |

Solution

  • Here is one way with conditional aggregation, Rextester:

    select 
        user_id
        ,MAX(case when '2017-11-17'-visit_date <=30
              then 1
              else 0
         end) as last_30
        ,MAX(case when '2017-11-17'-visit_date >=31
                   and '2017-11-17'-visit_date <=60
              then 1
              else 0
         end) as between_31_60
        ,MAX(case when '2017-11-17'-visit_date >=61
                  and '2017-11-17'-visit_date <=90
              then 1
              else 0
         end) as between_61_90
    from 
        visits
    group by user_id
    order by user_id