Search code examples
sqlarrayspostgresqlmedian

Calculate median for each element of a postgresql array


I have a postgresql table "samples" containing a column "intensity" that consists of an array of 200 integers. I want to calculate the median of a bunch of samples, for each element of the array, eg the median of intensity[1], the median of intensity[2], ..., the median of intensity[200]. If the column consisted of a single integer, then this would be easy using the percentile_disc function:

select percentile_disc(0.5) within group (order by intensity) from samples where...

However, since my column consists of an array, this does not work. Running the code gives a correct answer for intensity [1], but all other values are wrong (I suspect the code simply extracted the whole 200 integers from the sample that had the median of intensity[1]).

I could probably use a "for" loop to get the data but there must be a more elegant solution using unnest or something of that kind...


Solution

  • You need to unnest the array and keep track of each element. That would be something like this:

    select n, percentile_cont(0.5) within group (order by el) as median
    from samples s cross join lateral
         unnest(s.ar) with ordinality u(el, n)
    group by n;
    

    You can re-aggregate this into an array if you like.