Search code examples
sqlpostgresqlgroup-bygreatest-n-per-group

SQL group by with Case statement


I have a table that looks something like this

+-----------+-----------+---------------------+
| Report_id |  Status   |        Date         |
+-----------+-----------+---------------------+
|         1 | Completed | 2020-06-07 12:20:00 |
|         1 | Completed | 2020-06-07 12:22:00 |
|         2 | Running   | 2020-06-07 13:02:00 |
|         2 | Completed | 2020-06-07 13:10:00 |
|         3 | Completed | 2020-06-07 14:10:00 |
|         3 | Failed    | 2020-06-07 14:04:00 |
+-----------+-----------+---------------------+

I need to group this data by Report_id. So if all status values within groups are equal to Completed than Status is Completed and Date is the max value within group. But if there is one Running or Failed value within group than Status needs to be equal to Running and Failed respectively and Date should match this value.

The output will look like this.

+-----------+-----------+---------------------+
| Report_id |  Status   |        Date         |
+-----------+-----------+---------------------+
|         1 | Completed | 2020-06-07 12:22:00 |
|         2 | Running   | 2020-06-07 13:02:00 |
|         3 | Failed    | 2020-06-07 14:04:00 |
+-----------+-----------+---------------------+

I suspect I need to use Case statement somewhere to get this output but I can't figure out how. Please help.


Solution

  • You can use STRING_AGG to combine the text of the status columns into one string and check if that string contains 'running' or 'failed'

    select report_id, 
    case when STRING_AGG(status, '') like '%Running%' then 'Running'
    when STRING_AGG(status, '') like '%Failed%' then 'Failed'
    else 'Completed' end Status,
    max(date) Date
    from abc
    group by report_id