sql postgresql date group-by median

How to calculate the median in Postgres?


I have created a basic database (picture attached). I am trying to find the following:

"Median total amount spent per user in each calendar month"

I tried the following, but I am getting errors:

SELECT 
user_id,
AVG(total_per_user)
FROM (SELECT user_id,
        ROW_NUMBER() over (ORDER BY total_per_user DESC) AS desc_total,
        ROW_NUMBER() over (ORDER BY total_per_user ASC) AS asc_total
      FROM (SELECT EXTRACT(MONTH FROM created_at) AS calendar_month,
            user_id,    
            SUM(amount) AS total_per_user
            FROM transactions
            GROUP BY calendar_month, user_id) AS total_amount   
      ORDER BY user_id) AS a
WHERE asc_total IN (desc_total, desc_total+1, desc_total-1)
GROUP BY user_id
;

Solution

  • In Postgres, you can use the ordered-set aggregate function percentile_cont() to take the median of each user's monthly totals:

    select 
        user_id,
        percentile_cont(0.5) within group(order by total_per_user) median_total_per_user
    from (
        select user_id, sum(amount) total_per_user
        from transactions
        group by date_trunc('month', created_at), user_id
    ) t
    group by user_id
    

    Note that date_trunc() is probably closer to what you want than extract(month from ...). extract() returns only the month number, so it would sum amounts of the same month from different years together, which is not how I understood your requirement.
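To make the difference between the two functions concrete, here is a minimal sketch that can be run on its own in Postgres. The two sample timestamps are hypothetical values chosen for illustration and do not come from the transactions table in the question.

```sql
-- Hypothetical sample values: the same calendar month in two different years.
select
    ts,
    extract(month from ts)  as month_number,  -- month number only, the year is lost
    date_trunc('month', ts) as month_start    -- first instant of that month, the year is kept
from (values
    (timestamp '2020-03-15 10:00:00'),
    (timestamp '2021-03-20 12:00:00')
) as v(ts);

-- month_number is 3 for both rows, so grouping on it would merge the two years.
-- month_start is 2020-03-01 00:00:00 for the first row and
-- 2021-03-01 00:00:00 for the second, so grouping on it keeps them apart.
```

Grouping by the date_trunc() value therefore gives one total per user per actual calendar month, which is what the inner query of the answer relies on.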