Search code examples
sqlpostgresqlgroup-byaveragedate-difference

Calculating the average of different investment rounds for different companies in PostgreSQL


I have a table containing the company_id, investments rounds name(like round A, B, C, or IPO ......), and the date(like 2001-05-07) of each investment round for each company. I want to calculate the average gaps for different investment rounds for all companies. For example, what is the average time all companies take from A to B? What is the average time all companies take from B to C? What is the average time all companies take from C to D? The table looks like this:

|company_id| |invest_rounds_type_name| |invest_date|
---------------------------------------------------
1             A                         2001-01-01
---------------------------------------------------
1             B                         2001-12-05
---------------------------------------------------
1             C                         2003-11-12
---------------------------------------------------
2             A                         1963-03-01
---------------------------------------------------
2             B                         1967-10-10
---------------------------------------------------
2             C                         1970-10-12
---------------------------------------------------
2             D                         1971-01-05
---------------------------------------------------
3             B                         2017-11-20
---------------------------------------------------
3             A                         2017-11-16
---------------------------------------------------
3             C                         2018-03-19
---------------------------------------------------

Thanks for any help!


Solution

  • step-by-step demo:db<>fiddle

    SELECT
        invest_round as invest_round_start,
        invest_round_end,
        AVG(days_required)
    FROM (
        SELECT
            *,
            lead(invest_round) OVER w as invest_round_end,          
            lead(invest_date) OVER w - invest_date as days_required
        FROM mytable
        WINDOW w AS (PARTITION BY company_id ORDER BY invest_round)
    ) s
    WHERE invest_round_end IS NOT NULL
    GROUP BY invest_round, invest_round_end
    ORDER BY invest_round
    

    By using the lead() window function you can copy the next value of a specific column onto your current. So you can get the following invest_round to the current records as well as the following invest_date.

    With the following date and the current one you are able to calculate the duration between both invest_rounds.

    Now you just have to group by the invest_rounds and calculate the AVG aggregate.