Search code examples
sqlaggregatepresto

How to aggregate data for logic and show non-aggregated results in the output?


I have a table that looks like below:

person fruit date
A apple xxxx
B banana xxxx
C apple xxxx
A banana xxxx
C apple xxxx
B banana xxxx

I am interested in persons who have more than one banana in the data set. In this case, it would be person B. I understand how to achieve this by aggregating the data. However, if I want my result to be NOT agrregated and look something like below, what would be the best way?

person fruit date
B banana xxxx
B banana xxxx

Solution

  • You can achieve this by using window function in subquery (in this case with count_if aggregate function) and filtering the result:

    -- sample data
    WITH dataset(person, fruit, date) AS (
        VALUES ('A',    'apple' ,'xxxx'),
        ('B',   'banana'    ,'xxxx'),
        ('C',   'apple' ,'xxxx'),
        ('A',   'banana'    ,'xxxx'),
        ('C',   'apple' ,'xxxx'),
        ('B',   'banana'    ,'xxxx')
    )
    
    -- query
    select person, fruit, date
    from (select *,
              count_if(fruit = 'banana') over(partition by person) banana_count
          from dataset)
    where banana_count > 1
    

    Output:

    person fruit date
    B banana xxxx
    B banana xxxx