Search code examples
pythonpandasdatetimeindexingmulti-index

Create timeseries data - Pandas


I have a multi-index dataframe of timeseries data which looks like the following;

      A  B  C 
1 1   21 32 4
  2   4  2  23
  3   12 9  10
  4   1  56 37
.
.
.
.
  30  63 1  27
  31  32 2  32
.
.
.
12 1  2  3  23
   2  23 1  12
   3  32 3  23
.
.
.
   31 23 2 32

It is essentially a multi-index of month and dates with three columns.

I need to turn this into daily data and essentially have a dataframe whereby there is a single index where value in the above dataframe responds to its' respective date over 10 years.

For exmaple;

Desired output;
            A  B  C
01/01/2017  21 32 4
.
.
31/12/2017  23 2 32
.
.
01/01/2022  21 32 4
.
.
31/12/2022  23 2 32

I hope this is clear! Its essentially turning daily/monthly data into daily/monthly/yearly data.


Solution

  • You can use:

    df.index = pd.to_datetime(df.index.rename(['month', 'day']).to_frame().assign(year=2022))
    

    Output:

                 A   B   C
    2022-01-01  21  32   4
    2022-01-02   4   2  23
    2022-01-03  12   9  10
    2022-01-04   1  56  37
    2022-01-30  63   1  27
    2022-01-31  32   2  32
    2022-12-01   2   3  23
    2022-12-02  23   1  12
    2022-12-03  32   3  23
    2022-12-31  23   2  32
    

    spanning several years

    There is no absolute fool proof way to handle years if those are missing. What we can do it to infer the year change when a date goes back in the past and add 1 year in this case:

    # let's assume the starting year is 2017
    date = pd.to_datetime(df.index.rename(['month', 'day']).to_frame().assign(year=2017))
    
    df.index = date + date.diff().lt('0').cumsum().mul(pd.DateOffset(years=1))
    

    output:

                 A   B   C
    2017-01-01  21  32   4
    2017-01-02   4   2  23
    2017-06-03  12   9  10
    2017-06-04   1  56  37
    2018-01-30  63   1  27 # added 1 year
    2018-01-31  32   2  32
    2018-12-01   2   3  23
    2018-12-02  23   1  12
    2018-12-03  32   3  23
    2018-12-31  23   2  32
    

    used input:

            A   B   C
    1  1   21  32   4
       2    4   2  23
    6  3   12   9  10
       4    1  56  37
    1  30  63   1  27 # here we go back from month 1 after month 6
       31  32   2  32
    12 1    2   3  23
       2   23   1  12
       3   32   3  23
       31  23   2  32