Search code examples
sqlpostgresqldatetimecountsubquery

How to save the results of a select expression as a variable?


In the following Postgresql sql, is there a way to save mo.delivered_at - mo.created_at as a variable so I don't have to repeat myself?

SELECT 
    to_char(mo.created_at,'MM-YYYY') AS month,
    mo.sku_key as sku,
    c.name,
    COUNT(*) as total,
    COUNT(*) FILTER (WHERE mo.delivered_at - mo.created_at < interval '3 days') as three_days,
    COUNT(*) FILTER (WHERE mo.delivered_at - mo.created_at > interval '3 days' and mo.delivered_at - mo.created_at <= interval '6 days') as six_days,
    COUNT(*) FILTER (WHERE mo.delivered_at - mo.created_at > interval '6 days' and mo.delivered_at - mo.created_at <= interval '9 days') as nine_days,
    COUNT(*) FILTER (WHERE mo.delivered_at - mo.created_at > interval '9 days') as ten_days,
    min(mo.delivered_at - mo.created_at),
    max(mo.delivered_at - mo.created_at),
    percentile_disc(0.5) within group (order by mo.delivered_at - mo.created_at) as median,
    avg(mo.delivered_at - mo.created_at) as average
FROM medication_order mo
LEFT JOIN subscription s ON s.id=mo.subscription_id
LEFT JOIN condition c on s.condition_id = c.id
WHERE 
    mo.status = 'DELIVERED' AND 
    mo.payment_preference = 'INSURANCE' AND
    mo.created_at > '2020-01-01' AND
    mo.delivered_at IS NOT null AND
    mo.sku_key != 'manual_order_sku'
GROUP BY month, mo.sku_key, c.name

Solution

  • You can compute the derived value in a subquery or CTE as has been suggested.

    But there is more. This should be faster (and correct). And can be sorted properly, too:

    SELECT
        to_char(mo.month,'MM-YYYY') AS month,  -- optionally prettify
        mo.sku,
        s.condition_id,  -- I added this to make the result unambiguous
        (SELECT name FROM condition WHERE id = s.condition_id) AS condition_name,
        COUNT(*) AS total,
        COUNT(*) FILTER (WHERE mo.my_interval < interval '3 days') AS three_days,
        COUNT(*) FILTER (WHERE mo.my_interval > interval '3 days' AND mo.my_interval <= interval '6 days') AS six_days,
        COUNT(*) FILTER (WHERE mo.my_interval > interval '6 days' AND mo.my_interval <= interval '9 days') AS nine_days,
        COUNT(*) FILTER (WHERE mo.my_interval > interval '9 days') AS ten_days,
        min(mo.my_interval),
        max(mo.my_interval),
        percentile_disc(0.5) WITHIN GROUP (ORDER BY mo.my_interval) AS median,
        avg(mo.my_interval) AS average
    FROM  (
       SELECT
           date_trunc('month', mo.created_at) AS month,         -- faster, keeps ORDER
           delivered_at - created_at          AS my_interval,   -- your core request
           sku_key                            AS sku
       FROM   medication_order mo
       WHERE  status = 'DELIVERED'                              -- filter early
       AND    payment_preference = 'INSURANCE'
       AND    created_at > '2020-01-01'
       AND    delivered_at IS NOT NULL
       AND    sku_key <> 'manual_order_sku'
       ) mo
    LEFT   JOIN subscription s ON s.id = mo.subscription_id
    GROUP  BY mo.month, mo.sku, s.condition_id  -- GROUP BY unique ID! Correct - and cheaper, too
    ORDER  BY mo.month, mo.sku, s.condition_id; -- my addition: sorting by date works across years, 'MM-YYYY' does not
    

    Aside: condition.name should probably be UNIQUE. And "name" is almost never a good name.