python pandas linear-regression polynomials

Linear Regression - re-scaled to not go over max value

I have a polynomial regression model that writes its predicted values ('predicted_rev_running_total') to a data frame. The column is supposed to be a running total along a project timeline that runs from 0 to 1. I sorted 'predicted_rev_running_total' from smallest to largest. My dilemma now is how to scale it so that it resembles something like the 'new_predicted_rev_running_total' column.

Solution

I think this is what you want. It's a simple two-step process:

1. Create a normalized value of the predicted column (just divide by the max per group).
2. Multiply the normalized value by the contract value.

# first create a normalized value of predicted column
# (transform returns a result aligned to df's index, so the assignment is safe)
df['normalized_predicted'] = df.groupby("project_timeline")["predicted_rev_running_total"].transform(lambda x: x / x.max())

# then, multiply it by the bill contract (column-wise, no row-by-row apply needed)
df['new_predicted_rev_running_total'] = df['normalized_predicted'] * df['Guaranteed Bill Contract Amt (Max Value)']
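
To check the two steps end to end, here is a minimal sketch on made-up numbers. The `project_id` column and all values are hypothetical, chosen only for illustration; substitute whichever column defines your groups. It also assumes each group's maximum prediction is positive, otherwise the division would not give a meaningful 0 to 1 scale.

```python
import pandas as pd

# Toy data for illustration only; `project_id` is a hypothetical grouping column.
df = pd.DataFrame({
    "project_id": ["A", "A", "A", "B", "B", "B"],
    "project_timeline": [0.0, 0.5, 1.0, 0.0, 0.5, 1.0],
    "predicted_rev_running_total": [10.0, 60.0, 120.0, 5.0, 20.0, 40.0],
    "Guaranteed Bill Contract Amt (Max Value)": [100.0, 100.0, 100.0, 50.0, 50.0, 50.0],
})

# Step 1: divide each prediction by the largest prediction in its group.
group_max = df.groupby("project_id")["predicted_rev_running_total"].transform("max")
df["normalized_predicted"] = df["predicted_rev_running_total"] / group_max

# Step 2: scale the 0-to-1 value up to the contract amount.
df["new_predicted_rev_running_total"] = (
    df["normalized_predicted"] * df["Guaranteed Bill Contract Amt (Max Value)"]
)

print(df[["project_id", "project_timeline", "new_predicted_rev_running_total"]])
```

With this data, the last row of each group lands exactly on the contract amount, and every earlier row stays below it.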