python-3.x pandas dataframe multi-index

How to drop a values column of a pivot table DataFrame


I need to drop a sub-column of a MultiIndex DataFrame created from a pivot table, but only under specific month columns, chosen dynamically.

The rule depends on today's date:
If the day of the month is less than 15, drop the sub-column Bill1 for all months except Sep-19 (the current month).
If the day of the month is greater than 15, drop the sub-column Bill1 for all months except Oct-19 (the next month).
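
The date rule above can be sketched on its own, before touching the DataFrame. This is only an illustration: the helper name `month_keeping_bill1` is hypothetical, day 15 itself is assumed to fall in the "next month" branch (the question does not say), and `%b` is assumed to produce English month abbreviations (the default C/English locale).

```python
from datetime import date

def month_keeping_bill1(today: date) -> str:
    """Return the '%b-%y' label of the month whose Bill1 column is kept."""
    if today.day < 15:
        # Before the 15th: the current month keeps Bill1.
        year, month = today.year, today.month
    elif today.month == 12:
        # From the 15th on, in December: roll over to January of the next year.
        year, month = today.year + 1, 1
    else:
        # From the 15th on: the next month keeps Bill1.
        year, month = today.year, today.month + 1
    return date(year, month, 1).strftime('%b-%y')

print(month_keeping_bill1(date(2019, 9, 10)))
print(month_keeping_bill1(date(2019, 9, 20)))
```

With an English locale, the two calls should give the Sep-19 and Oct-19 labels used in the question.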

data_frame1 = pd.pivot_table(data_frame, index=['PC', 'Geo', 'Comp'], values=['Bill1', 'Bill2'], columns=['Month'], fill_value=0)
data_frame1 = data_frame1.swaplevel(0,1, axis=1).sort_index(axis=1)
tuples = [(a.strftime('%b-%y'), b) if a!= 'All' else (a,b) for a,b in data_frame1.columns]
data_frame1.columns = pd.MultiIndex.from_tuples(tuples)

Output:

              Sep-19          Oct-19          Nov-19
              Bill1  Bill2    Bill1  Bill2    Bill1  Bill2
PC Geo Comp
A  Ind OS         1   1.28        1   1.28        1   1.28
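
For anyone who wants to reproduce this frame, here is a minimal sketch. The input rows are hypothetical (the question does not show `data_frame`); they are chosen only so the pivot matches the output above. The `'All'` check from the original comprehension is left out because no margins are requested here.

```python
import pandas as pd

# Hypothetical input rows, assumed for illustration only.
data_frame = pd.DataFrame({
    'PC': ['A'] * 3,
    'Geo': ['Ind'] * 3,
    'Comp': ['OS'] * 3,
    'Month': pd.to_datetime(['2019-09-01', '2019-10-01', '2019-11-01']),
    'Bill1': [1, 1, 1],
    'Bill2': [1.28, 1.28, 1.28],
})

# Same steps as in the question: pivot, put Month on the top level, sort.
data_frame1 = pd.pivot_table(data_frame, index=['PC', 'Geo', 'Comp'],
                             values=['Bill1', 'Bill2'], columns=['Month'],
                             fill_value=0)
data_frame1 = data_frame1.swaplevel(0, 1, axis=1).sort_index(axis=1)

# Replace the Timestamp level with 'Sep-19'-style labels.
data_frame1.columns = pd.MultiIndex.from_tuples(
    [(month.strftime('%b-%y'), bill) for month, bill in data_frame1.columns]
)
print(data_frame1)
```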

Desired output:
If today's date is less than 15:

              Sep-19          Oct-19    Nov-19
              Bill1  Bill2    Bill2     Bill2
PC Geo Comp
A  Ind OS         1   1.28     1.28      1.28


If today's date is greater than 15:

              Sep-19    Oct-19          Nov-19
              Bill2     Bill1  Bill2    Bill2
PC Geo Comp
A  Ind OS      1.28         1   1.28     1.28

Solution

  • Use:

    #df is the pivoted DataFrame (data_frame1 in the question)
    #convert the first column level to datetimes, then to month periods
    level0 = pd.to_datetime(df.columns.get_level_values(0), format='%b-%y').to_period('M')
    #get the second column level
    level1 = df.columns.get_level_values(1)
    print (level0)
    PeriodIndex(['2019-09', '2019-09', '2019-10', '2019-10', '2019-11', '2019-11'],
                 dtype='period[M]', freq='M')
    
    print (level1)
    Index(['Bill1', 'Bill2', 'Bill1', 'Bill2', 'Bill1', 'Bill2'], dtype='object')
    
    #for testing, hard-code a date on or after the 15th instead
    #dat = pd.to_datetime('2019-09-20')
    #get today's timestamp (local time)
    dat = pd.Timestamp.now()
    print (dat)
    
    #convert timestamp to a month period
    today_per = dat.to_period('M')
    
    #compare the day of the month and filter columns
    if dat.day < 15:
        #keep Bill1 only under the current month
        df = df.loc[:, (level0 == today_per) | (level1 != 'Bill1')]
    else:
        #keep Bill1 only under the next month (period + 1 month)
        df = df.loc[:, (level0 == today_per + 1) | (level1 != 'Bill1')]
    print (df)
             Sep-19       Oct-19 Nov-19
              Bill1 Bill2  Bill2  Bill2
    A Ind OS      1  1.28   1.28   1.28
    

    Test with a date on or after the 15th, so that next month's Bill1 is kept:

    #df is again the original pivoted DataFrame (data_frame1 in the question)
    #convert the first column level to datetimes, then to month periods
    level0 = pd.to_datetime(df.columns.get_level_values(0), format='%b-%y').to_period('M')
    #get the second column level
    level1 = df.columns.get_level_values(1)
    print (level0)
    PeriodIndex(['2019-09', '2019-09', '2019-10', '2019-10', '2019-11', '2019-11'],
                 dtype='period[M]', freq='M')
    
    print (level1)
    Index(['Bill1', 'Bill2', 'Bill1', 'Bill2', 'Bill1', 'Bill2'], dtype='object')
    
    #hard-coded test date on or after the 15th
    dat = pd.to_datetime('2019-09-20')
    #get today's timestamp (local time)
    #dat = pd.Timestamp.now()
    print (dat)
    
    #convert timestamp to a month period
    today_per = dat.to_period('M')
    
    #compare the day of the month and filter columns
    if dat.day < 15:
        #keep Bill1 only under the current month
        df = df.loc[:, (level0 == today_per) | (level1 != 'Bill1')]
    else:
        #keep Bill1 only under the next month (period + 1 month)
        df = df.loc[:, (level0 == today_per + 1) | (level1 != 'Bill1')]
    print (df)
             Sep-19 Oct-19       Nov-19
              Bill2  Bill1 Bill2  Bill2
    A Ind OS   1.28      1  1.28   1.28
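
    The two runs above can also be wrapped in one function that takes the date as an argument, which makes both branches easy to check without editing comments. This is a sketch only: the function name and its `sub_col` and `cutoff_day` parameters are hypothetical, and the sample frame is rebuilt by hand from the values shown in the question.

```python
import pandas as pd

def drop_sub_col_outside_kept_month(df, today, sub_col='Bill1', cutoff_day=15):
    """Keep sub_col only under the current month (before cutoff_day)
    or under the next month (on or after cutoff_day)."""
    level0 = pd.to_datetime(df.columns.get_level_values(0),
                            format='%b-%y').to_period('M')
    level1 = df.columns.get_level_values(1)
    today = pd.Timestamp(today)
    kept = today.to_period('M')
    if today.day >= cutoff_day:
        kept = kept + 1  # next month; rolls over the year boundary
    return df.loc[:, (level0 == kept) | (level1 != sub_col)]

# Sample frame assumed from the question's printed output.
cols = pd.MultiIndex.from_product([['Sep-19', 'Oct-19', 'Nov-19'],
                                   ['Bill1', 'Bill2']])
idx = pd.MultiIndex.from_tuples([('A', 'Ind', 'OS')],
                                names=['PC', 'Geo', 'Comp'])
df = pd.DataFrame([[1, 1.28, 1, 1.28, 1, 1.28]], index=idx, columns=cols)

early = drop_sub_col_outside_kept_month(df, '2019-09-10')
late = drop_sub_col_outside_kept_month(df, '2019-09-20')
print(list(early.columns))
print(list(late.columns))
```

    Because the function returns a new frame instead of reassigning `df`, the original stays intact and both dates can be tried against the same input.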