Search code examples
sqlpostgresqlgreatest-n-per-grouppostgresql-9.5

Group by year and month and get min value for a month, with the date


This is what my table looks like

Column  |     Type     |
--------+--------------+
id      | integer      |
date    | date         |
open    | numeric(9,2) |
close   | numeric(9,2) |
low     | numeric(9,2) |
high    | numeric(9,2) |

I'd like to get the date of the lowest close value for a given month, for all the months.

This is what I've managed to get,

SELECT 
    temp_table.year, temp_table.month, MIN(close)
FROM
    (
        SELECT 
            date_trunc('year', date) as year, date_trunc('month', date) as month, date, close 
        FROM
            sensex_sensex
        GROUP BY 
            year, month, date, close
        ORDER BY 
            year, month
    ) AS temp_table
GROUP BY
    temp_table.year, temp_table.month
ORDER BY
    temp_table.year DESC, temp_table.month DESC;

This gives me the year month and lowest close value. But when I try to add the date I get all the rows instead of being grouped by year and month. How do I go about getting the result as

Year | Month | Date of Lowest Close in a Month | Lowest Close in a Month

Also additionally,

I'd also like to be able to find the least 5 close values in a month with their dates, again grouped by year and month.


Solution

  • demo:db<>fiddle

    Using the window function MIN selects the minimum value per frame (the month in your case).

    SELECT
        extract('year' FROM mydate) as year,
        extract('month' FROM mydate) as month,
        mydate, close
    FROM (
        SELECT
            mydate,
            close,
            MIN(close) OVER (PARTITION BY date_trunc('month', mydate)) as min_close
        FROM
            temp_table
    ) s
    WHERE close = min_close
    

    Instead of MIN you could use ROW_NUMBER. This helps you if want to select not only the one minimum but the two or five smallest numbers (n):

    SELECT
        extract('year' FROM mydate) as year,
        extract('month' FROM mydate) as month,
        mydate, close
    FROM (
        SELECT
            mydate,
            close,
            ROW_NUMBER() OVER (PARTITION BY date_trunc('month', mydate) ORDER BY close) as rn
        FROM
            temp_table
    ) s
    WHERE rn <= n              -- n is the number of values you get.
    

    Do you really need separate columns for year and month which can be easily calculated from the date?