I have the following dataframe:
in_scenario_USA USA index_in
month year
4 1960 NaN 0
5 1960 NaN 0
6 1960 0.000000 1.0
7 1960 0.000000 1.0
8 1960 0.000000 1.0
9 1960 0.000000 1.0
10 1960 0.000000 1.0
11 1960 0.000000 1.0
12 1960 0.000000 1.0
1 1961 0.000000 1.0
2 1961 0.025536 1.0
3 1961 0.003843 1.0
4 1961 0.019139 1.0
5 1961 0.000000 1.0
The first column is the percentage return of one month. Based on my scenarios, it is either 0 or a number. I would like my second column to have its first non null value to remain 1 and then each subsequent value can be explained with the following formula:
USA index_in[i] = USA index_in[i-1] * (1 + in_scenario_USA)[i]
so in the end it would look like this:
in_scenario_USA USA index_in
month year
4 1960 NaN 0
5 1960 NaN 0
6 1960 0.000000 1.0
7 1960 0.000000 1.0
8 1960 0.000000 1.0
9 1960 0.500000 1.5
10 1960 0.500000 2.25
11 1960 0.000000 2.25
12 1960 0.000000 2.25
1 1961 0.000000 2.25
2 1961 -0.200000 1.8
3 1961 0.100000 1.98
4 1961 0.100000 2.178
5 1961 0.000000 2.178
I've tried many loops but the one I feel the closest to be accurate is this one but I only get NaN values in the end.
for i in range(0, len(df_merged[col + ' index_in'])):
if df_merged[col + ' index_in'].iloc[i] == 1 and (df_merged[col + ' index_in'].iloc[-i] == 0):
continue
else:
df_merged[col + ' index_in'].iloc[i] = np.multiply(df_merged[col + ' index_in'].iloc[i-1], df_merged['in_scenario_' + col].iloc[i])
Thanks for your help.
I think your problem is the NaN
values in your first couple rows, that get multiplied through and result in adding NaN values. Try this:
newcol = []
firstnonnan = True
for index, row in df.iterrows():
if row['in_scenario_USA'].isnull():
newcol.append(row['USA_index_in'])
elif firstnonnan = True:
newcol.append(1)
firstnonnan = False
else:
newcol.append(newcol[-1]*(1+row['in_scenario_USA']))
df['USA index_in'] = newcol
Basically, you're going to iterate through the dataframe and build a list to overwrite your column with. If you have a NaN
value, you're going to take the same value that's already in that column. The first time you see a non-NaN
, you'll add a 1 to your list. After that, you'll use the last item in the list as your previous value.