python dataframe multiple-columns assign

Assign new column in DataFrame based on if value is in a certain value range

I have two DataFrames as follows:

df_discount = pd.DataFrame(data={'Graduation' : np.arange(0,1000,100), 'Discount %' : np.arange(0,50,5)})
df_values = pd.DataFrame(data={'Sum' : [20,801,972,1061,1251]})

Now my goal is to get a new column df_values['New Sum'] for my df_values that applies the corresponding discount to df_values['Sum'] based on the value of df_discount['Graduation']. If the Sum is >= the Graduation the corresponding discount is applied.

Examples: Sum 801 should get a discount of 40% resulting in 480.6, Sum 1061 gets 45% resulting in 583.55.

I know I could write a funtion with if else conditions and the returning values. However, is there a better way to do this if you have very many different conditions?

Solution

You could try if pd.merge_asof() works for you:

df_discount = pd.DataFrame({
    'Graduation': np.arange(0, 1000, 100), 'Discount %': np.arange(0, 50, 5)
})
df_values = pd.DataFrame({'Sum': [20, 100, 101, 350, 801, 972, 1061, 1251]})

df_values = (
    pd.merge_asof(
        df_values, df_discount,
        left_on="Sum", right_on="Graduation",
        direction="backward"
    )
    .assign(New_Sum=lambda df: df["Sum"] * (1 - df["Discount %"] / 100))
    .drop(columns=["Graduation", "Discount %"])
)

Result (without the last .drop(columns=...) to see what's happening):

    Sum  Graduation  Discount %  New_Sum
0    20           0           0    20.00
1   100         100           5    95.00
2   101         100           5    95.95
3   350         300          15   297.50
4   801         800          40   480.60
5   972         900          45   534.60
6  1061         900          45   583.55
7  1251         900          45   688.05