Tags: python, pandas, dataframe, numpy, shift

How can I subtract the previous year's value in one column from the current year's value in another column?


I have a table with many columns, but I will only show the ones that matter here.

Year  Current Amount  Previous Amount  Type
2021           22000            36000  C5 70
2022           26000            48000  C4 68
2023           76440            34000  C5 70
2024           21399            12000  C4 68
2024           22000            11000  C4 34
2024           26500            13000  C4 23
2025           16000            34000  C4 68
2025           76000            22000  C4 34
2025           34000            64000  C4 23

The real data of course has many more years. What I want is to subtract the Previous Amount of the preceding year from the Current Amount of each year: for example, the Current Amount for 2022 minus the Previous Amount for 2021. In other words, for every year, Change = Current Amount of that year minus Previous Amount of the year before. I want the result to look like this:

Year  Current Amount  Previous Amount  Type   Change
2021           22000            36000  C5 70     NaN
2022           26000            48000  C4 68  -10000
2023           76440            34000  C5 70   28440
2024           21399            12000  C4 68   35899
2024           22000            11000  C4 34   35899
2024           26500            13000  C4 23   35899
2025           16000            34000  C4 68   90000
2025           76000            22000  C4 34   90000
2025           34000            64000  C4 23   90000

I have tried the following code, but no luck so far:

Df['Current Amount'] - Df['Previous Amount'].where(Df['Year']==Df['Year']-1)

Also:

Df['Current Amount'] - Df['Previous Amount'].where(Df['Year']==Df['Year'].shift(-1))

Please help me. Thanks in Advance.


Solution
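
For context on why the two attempts in the question do not work, here is a minimal sketch. The DataFrame is rebuilt from the question's sample table (numeric columns only), and the names `mask1`, `mask2`, `attempt1` and `attempt2` are illustrative, not from the original post.

```python
import pandas as pd

# sample data copied from the question (Type column omitted)
df = pd.DataFrame({
    "Year": [2021, 2022, 2023, 2024, 2024, 2024, 2025, 2025, 2025],
    "Current Amount": [22000, 26000, 76440, 21399, 22000, 26500, 16000, 76000, 34000],
    "Previous Amount": [36000, 48000, 34000, 12000, 11000, 13000, 34000, 22000, 64000],
})

# Attempt 1: a year never equals itself minus one, so the mask is all False
mask1 = df["Year"] == df["Year"] - 1
attempt1 = df["Current Amount"] - df["Previous Amount"].where(mask1)

# Attempt 2: the mask is True only where the NEXT row has the same year,
# and where() still keeps the Previous Amount of the SAME row
mask2 = df["Year"] == df["Year"].shift(-1)
attempt2 = df["Current Amount"] - df["Previous Amount"].where(mask2)

print(attempt1.isna().all())  # True: every result is NaN
print(mask2.tolist())
```

The first mask is never true, so `where` blanks out every Previous Amount and the subtraction returns only NaN. The second mask marks rows followed by a row of the same year, and the value kept is from the same row, so nothing from the preceding year is ever used.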

  • duplicated data (several rows per year, summed per year before subtracting)

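    # aggregate: one total per year for each amount column
    df2 = df.groupby('Year')[['Current Amount', 'Previous Amount']].sum()
    # make sure every intermediate year is present (missing years become NaN rows)
    df2 = df2.reindex(range(df2.index.min(), df2.index.max()+1))
    
    # change = this year's Current Amount minus the previous year's Previous Amount
    change = df2['Current Amount'].sub(df2['Previous Amount'].shift()).rename('Change')
    # merge the per-year change back onto every row of that year
    out = df.merge(change, on='Year', how='left')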
    

    output:

       Year  Current Amount  Previous Amount   Type   Change
    0  2021           22000            36000  C5 70      NaN
    1  2022           26000            48000  C4 68 -10000.0
    2  2023           76440            34000  C5 70  28440.0
    3  2024           21399            12000  C4 68  35899.0
    4  2024           22000            11000  C4 34  35899.0
    5  2024           26500            13000  C4 23  35899.0
    6  2025           16000            34000  C4 68  90000.0
    7  2025           76000            22000  C4 34  90000.0
    8  2025           34000            64000  C4 23  90000.0
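
To try this end to end, here is a self-contained sketch that rebuilds the sample table from the question and follows the same aggregate, reindex and shift steps. As a small variation it uses `Series.map` instead of `merge` to put the per-year value back on each row; the names `totals`, `change` and `out` are illustrative.

```python
import pandas as pd

# sample data copied from the question
df = pd.DataFrame({
    "Year": [2021, 2022, 2023, 2024, 2024, 2024, 2025, 2025, 2025],
    "Current Amount": [22000, 26000, 76440, 21399, 22000, 26500, 16000, 76000, 34000],
    "Previous Amount": [36000, 48000, 34000, 12000, 11000, 13000, 34000, 22000, 64000],
    "Type": ["C5 70", "C4 68", "C5 70", "C4 68", "C4 34", "C4 23", "C4 68", "C4 34", "C4 23"],
})

# one total per year for each amount column
totals = df.groupby("Year")[["Current Amount", "Previous Amount"]].sum()
# add missing years as NaN rows so shift() always moves by one calendar year
totals = totals.reindex(range(totals.index.min(), totals.index.max() + 1))

# this year's total Current Amount minus last year's total Previous Amount
change = totals["Current Amount"].sub(totals["Previous Amount"].shift())

# look up the per-year change for every row (the index of `change` is unique)
out = df.assign(Change=df["Year"].map(change))
print(out)
```

Because `change` has exactly one entry per year, the lookup with `map` gives the same column as the left merge on `Year`, without creating a second DataFrame.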
    

    older answer (the output below uses one row per year)

    IIUC, assuming consecutive years, use:

    df['Change'] = df['Current Amount'].sub(df['Previous Amount'].shift())
    

    Alternatively, to look up the previous year by its value rather than by row position, use:

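    # Previous Amount indexed by Year
    s = df.set_index('Year')['Previous Amount']
    # move the labels forward so each value sits under the following year
    s.index += 1
    
    # look up the previous year's amount for each row and subtract it
    df['Change'] = df['Current Amount'].sub(df['Year'].map(s))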
    

    output:

       Year  Current Amount  Previous Amount   Type   Change
    0  2021           22000            36000  C5 70      NaN
    1  2022           26000            48000  C4 68 -10000.0
    2  2023           76440            34000  C5 70  28440.0
    3  2024           21399            12000  C4 68 -12601.0
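
The difference between the two older variants only shows up when a year is missing. The sketch below uses made-up data with one row per year and no row for 2023 (this gap is an assumption for illustration, not part of the question); `by_position` and `by_year` are hypothetical names.

```python
import pandas as pd

# hypothetical data: one row per year, 2023 is missing
df = pd.DataFrame({
    "Year": [2021, 2022, 2024],
    "Current Amount": [22000, 26000, 21399],
    "Previous Amount": [36000, 48000, 12000],
})

# positional shift: 2024 is paired with the row above it, which is 2022
by_position = df["Current Amount"].sub(df["Previous Amount"].shift())

# label-based lookup: relabel each Previous Amount with the following year
s = df.set_index("Year")["Previous Amount"]
s.index += 1
by_year = df["Current Amount"].sub(df["Year"].map(s))

print(by_position.tolist())  # [nan, -10000.0, -26601.0]
print(by_year.tolist())      # [nan, -10000.0, nan]
```

With the positional shift, 2024 silently uses the 2022 value, while the lookup by year returns NaN because there is no 2023 row to subtract.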