Search code examples
pythonpandaspandas-resample

Convert quarterly target to monthly target using resample, but it only gives me targets from Jan until October. The last two months are missing


My goal is to convert a quarterly target to a monthly target.

Below is my code where I specify the number of meetings a sales person has per quarter. And using resample method I then convert the quarterly target to monthly target. However, the output only gives me the targets between Jan and Oct.

import pandas as pd

# INPUT
# create quarterly meeting targets per sales person
quarters =  ['2023-Q1','2023-Q2','2023-Q3','2023-Q4']
anne =      [40,40,40,40]
martijn =   [40,40,40,40]
kevin =     [40,40,40,40]
danny =     [40,40,40,40]
rick =      [40,40,40,40]
fiona =     [35,35,35,35]
df = pd.DataFrame({'quarter': quarters, 'anne': anne, 'martijn': martijn, 'danny': danny, 'kevin': kevin, 'rick': rick, 'fiona': fiona})

df_melted = df.melt(id_vars=['quarter'], var_name='sales person', value_name='meeting target') # flatten table
df_melted['meeting target'] = df_melted['meeting target'].astype(float)/3 # convert meeting target column to float
df_melted['quarter'] = pd.to_datetime(df_melted['quarter']) #convert quarter column to date type

df_melted = df_melted.set_index('quarter').groupby('sales person').resample('M')['meeting target'].ffill().round(2)
df_melted = df_melted.reset_index()
df_melted

Below is the output for one of the sales people, as you can see, the output only goes till 2023-10-31, instead of 2023-12-31.

output

Can someone tell me what I'm missing?


Solution

  • Here's a workaround from just playing around with different options. Apparently converting to a periodIndex instead of datetime works but I don't know why. I agree with you that it seems like weird behavior to not get all months back after resampling.

    import pandas as pd
    
    # INPUT
    # create quarterly meeting targets per sales person
    quarters =  ['2023-Q1','2023-Q2','2023-Q3','2023-Q4']
    anne =      [30,60,90,120]
    martijn =   [33,63,93,123]
    df = pd.DataFrame({'quarter': quarters, 'anne': anne, 'martijn': martijn})
    
    df_melted = df.melt(id_vars=['quarter'], var_name='sales person', value_name='meeting target') # flatten table
    df_melted['meeting target'] = df_melted['meeting target'].astype(float)/3 # convert meeting target column to float
    
    #NOTE: this is the one difference, making a PeriodIndex instead of a datetime index
    #df_melted['month'] = pd.to_datetime(df_melted['quarter']) #convert quarter column to date type
    df_melted['month'] = pd.PeriodIndex(df_melted['quarter'], freq='Q')
    
    out_df = (
        df_melted.set_index('month')
            .groupby('sales person')
            .resample('M', convention='start')['meeting target'] #setting convention='start' is also req'd
            .ffill()
            .round(2)
            .reset_index()
    )
    out_df