Tags: python, pandas, date, group-by

Pandas - add rows in a grouped dataframe with varying values


I want to add rows per group. In each new row, the date must be advanced by an interval specific to that group. Example: n_times is the number of rows for the corresponding group, and interval_days is the number of days added per row for that group.

Id n_times interval_days  date (MM-dd)
1  3       4              03-01  
2  1       4              03-02
3  2       5              03-05
4  4       3              03-07
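
For reproducibility, the sample table above could be built as a DataFrame roughly like this. This is a minimal sketch: the variable name `df` and the choice to store the dates as 'MM-dd' strings are assumptions, since the question does not say how the column is stored.

```python
import pandas as pd

# Sample input from the table above.
# Assumption: dates are stored as 'MM-dd' strings with no year.
df = pd.DataFrame(
    {
        "Id": [1, 2, 3, 4],
        "n_times": [3, 1, 2, 4],
        "interval_days": [4, 4, 5, 3],
        "date (MM-dd)": ["03-01", "03-02", "03-05", "03-07"],
    }
)
print(df)
```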

Desired:

Id n_times interval_days  date (MM-dd)
1  3       4              03-01  
1  3       4              03-05 
1  3       4              03-09 
2  1       4              03-02
3  2       5              03-05
3  2       5              03-10
4  4       3              03-07
4  4       3              03-10
4  4       3              03-13
4  4       3              03-16

Solution

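  • Try this: build each group's dates with pd.date_range, explode them into one row per date, then format the dates back to MM-dd: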

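    import pandas as pd


    def generate_date_range(n_times: int, interval_days: int, start_date: str, year: str = '2024'):
        # The question gives only MM-dd, so a year (default '2024') is assumed for parsing.
        return pd.date_range(
            start=f'{year}-{start_date}', periods=n_times, freq=f'{interval_days}D'
        )


    result = df.set_index('Id')
    # *x unpacks each row positionally: n_times, interval_days, date (MM-dd)
    result['date (MM-dd)'] = result.apply(lambda x: generate_date_range(*x), axis=1)
    # one row per generated date
    result = result.explode('date (MM-dd)')
    # explode yields object dtype, so convert back to datetime before using .dt
    result['date (MM-dd)'] = pd.to_datetime(result['date (MM-dd)']).dt.strftime('%m-%d')
    print(result)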
    >>>
        n_times  interval_days date (MM-dd)
    Id                                     
    1         3              4        03-01
    1         3              4        03-05
    1         3              4        03-09
    2         1              4        03-02
    3         2              5        03-05
    3         2              5        03-10
    4         4              3        03-07
    4         4              3        03-10
    4         4              3        03-13
    4         4              3        03-16
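
As an alternative to the row-wise apply above, the same result can probably be reached without a Python-level function call per row, by repeating each row n_times and adding a per-group step counter times interval_days. This is a sketch under assumptions, not the answerer's method: the year 2024 is assumed only so the MM-dd strings can be parsed (it is a leap year, which matters for ranges crossing the end of February), and the names `out`, `step`, `start`, and `offset` are illustrative.

```python
import pandas as pd

# Sample input from the question.
df = pd.DataFrame(
    {
        "Id": [1, 2, 3, 4],
        "n_times": [3, 1, 2, 4],
        "interval_days": [4, 4, 5, 3],
        "date (MM-dd)": ["03-01", "03-02", "03-05", "03-07"],
    }
)

# Repeat each row n_times times; repeated rows keep their original index label.
out = df.loc[df.index.repeat(df["n_times"])].copy()

# Position of each row within its repeated group: 0, 1, 2, ...
step = out.groupby(level=0).cumcount()

# Assumption: the question gives no year, so 2024 is used purely for parsing.
start = pd.to_datetime("2024-" + out["date (MM-dd)"])

# Each row's date is the group's start date plus step * interval_days.
offset = pd.to_timedelta(step * out["interval_days"], unit="D")
out["date (MM-dd)"] = (start + offset).dt.strftime("%m-%d")

out = out.reset_index(drop=True)
print(out)
```

Unlike the apply/explode version, this one keeps Id as a regular column and resets the index to 0..n-1. Call `out.set_index("Id")` afterwards if the indexed layout is preferred.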