Search code examples
sqlpostgresqldatetimepostgresql-9.1generate-series

Count full months between two dates


I've been working on this for a few hours with no luck and have hit a wall. My data looks like this:

Date1          Date2
2012-05-06     2012-05-05
2012-03-20     2012-01-05

What I'm trying to do is add 1 to the count for every month between two dates. So my output would ideally look like this:

Year    Month    Sum
2012    2        1

In other words, it should check for "empty" months between two dates and add 1 to them.

This is the code I've worked out so far. It will basically count the number of months between the two dates and group them into months and years.

SELECT 
    EXTRACT(YEAR FROM Date2::date) as "Year", 
    EXTRACT(MONTH FROM Date2::date) as "Month",
    SUM(DATE_PART('year', Date1::date) - DATE_PART('year', Date2::date)) * 12 +
    (DATE_PART('month', Date1::date) - DATE_PART('month', Date2::date))
FROM
    test
GROUP BY 
    "Year", 
    "Month",
ORDER BY
    "Year" DESC, 
    "Month" DESC;

This is where I'm stuck - I don't know how to actually add 1 for each of the "empty" months.


Solution

  • Test setup

    With some sample rows (should be provided in the question):

    CREATE TABLE test (
       test_id serial PRIMARY KEY
     , date1   date NOT NULL
     , date2   date NOT NULL
    );
    
    INSERT INTO test(date1, date2)
    VALUES
       ('2012-03-20', '2012-01-05')  -- 2012-02 lies in between
     , ('2012-01-20', '2012-03-05')  -- 2012-02 (reversed)
     , ('2012-05-06', '2012-05-05')  -- nothing
     , ('2012-05-01', '2012-06-30')  -- still nothing
     , ('2012-08-20', '2012-11-05')  -- 2012-09 - 2012-10
     , ('2012-11-20', '2013-03-05')  -- 2012-12 - 2013-02
    ;
    

    Postgres 9.3 or newer

    Use a LATERAL join:

    SELECT to_char(mon, 'YYYY') AS year
         , to_char(mon, 'MM')   AS month
         , count(*) AS ct
    FROM  (
       SELECT date_trunc('mon',    least(date1, date2)::timestamp) + interval '1 mon' AS d1
            , date_trunc('mon', greatest(date1, date2)::timestamp) - interval '1 mon' AS d2
       FROM   test
       ) sub1
     , generate_series(d1, d2, interval '1 month') mon  -- implicit CROSS JOIN LATERAL
    WHERE  d2 >= d1 -- exclude ranges without gap right away
    GROUP  BY mon
    ORDER  BY mon;
    

    Postgres 9.2 or older

    No LATERAL, yet. Use a subquery instead:

    SELECT to_char(mon, 'YYYY') AS year
         , to_char(mon, 'MM')   AS month
         , count(*) AS ct
    FROM  (
       SELECT generate_series(d1, d2, interval '1 month') AS mon
       FROM  (
          SELECT date_trunc('mon',    least(date1, date2)::timestamp) + interval '1 mon' AS d1
               , date_trunc('mon', greatest(date1, date2)::timestamp) - interval '1 mon' AS d2
          FROM   test
          ) sub1
       WHERE  d2 >= d1 -- exclude ranges without gap right away
       ) sub2
    GROUP  BY mon
    ORDER  BY mon;
    

    Result

     year | month | ct
    ------+-------+----
     2012 |     2 |  2
     2012 |     9 |  1
     2012 |    10 |  1
     2012 |    12 |  1
     2013 |     1 |  1
     2013 |     2 |  1
    

    db<>fiddle here
    SQL Fiddle.

    Explanation

    You are looking for complete calendar months between the two dates.

    These queries work with any dates or timestamps in ascending or descending order and should perform well.

    The WHERE clause is optional, since generate_series() returns no row if start > end. But it should be a bit faster to exclude empty ranges a priori.

    The cast to timestamp makes it a bit cleaner and faster. Rationale: