Search code examples
sqlsql-serverapache-drill

SQL ratio between rows


I have a SQL table with the following format:

+------------------------------------+
| function_id | event_type | counter |
+-------------+------------+---------+
| 1           | fail       | 1000    |
| 1           | started    | 5000    |
| 2           | fail       | 800     |
| 2           | started    | 4500    |
| ...         | ...        | ...     |
+-------------+------------+---------+

I want to run a query over this that will group the results by function_id, by giving a ratio of the number of 'fail' events vs the number of 'started' events, as well as maintaining the number of failures. I.e. I want to run a query that will give something that looks like the following:

+-------------------------------------+
| function_id | fail_ratio | failures |
+-------------+------------+----------+
| 1           | 20%        | 1000     |
| 2           | 17.78%     | 800      |
| ...         | ...        |          |
+-------------+------------+----------+

I've tried a few approaches but have been unsuccessful so far. I'm using Apache Drill SQL at the moment, as this data is being pulled from flat files.

Any help would be greatly appreciated! :)


Solution

  • This is all conditional aggregation:

    select function_id,
           sum(case when event_type = 'fail' then counter*1.0 end) / sum(case when event_type = 'started' then counter end) as fail_start_ratio,
           sum(case when event_type = 'fail' then counter end) as failures
    from t
    group by function_id