Search code examples
pandascalculationvariance

Calculate Simple Yearly Variability Pandas


i'm struggling on how to calculate the yearly variability of a panda df (df1) with yearly data and given a long-term yearly average in df2 for different features. The calculation is simply (df1/df2)-1 * 100.

Suppose the data in df1 looks like this:

df1:
Site     Year     Value
Red      2019     50
Red      2020     40
Red      2021     60
White    2019     30
White    2020     70
White    2021     80
Blue     2019     10
Blue     2020     15
Blue     2021     20

df2:
Site    LT Value
Red     50
White   60
Blue    15

Answer like this...:
df1:
Site    Year    Value   Var
Red     2019    50     0.000
Red     2020    40    -0.200
Red     2021    60     0.200
White   2019    30    -0.500
White   2020    70     0.167
White   2021    80     0.333
Blue    2019    10    -0.333
Blue    2020    15     0.000
Blue    2021    20     0.333

It seems like something like this should work:

df1['Var'] = (df1['Value'] / df1['Site'].map(df2.set_index('Site'),['LT Value']) -1 ) * 100

But, I get a ValueError:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

thank you,


Solution

  • You have a typo in your code

    df1['Var'] = (df1['Value'] / df1['Site'].map(df2.set_index('Site'),['LT Value']) -1 ) * 100
    

    should be

    df1['Var'] = (df1['Value'] / df1['Site'].map(df2.set_index('Site')['LT Value']) -1 ) * 100
    

    without the comma between df2.set_index('Site') and ['LT Value']