I would like to apply a function to each row of a pandas dataframe. Instead of the argument being variable across rows, it's the function itself that is different for each row depending on the values in its columns. Let's be more concrete:
import pandas as pd
from scipy.interpolate import interp1d
d = {'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]}
df = pd.DataFrame(data=d)
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 1 | 2 | 3 |
| 1 | 2 | 4 | 6 |
Now, what I would like to achieve is to extrapolate columns 1 to 3 row-wise. For the first row, this would be:
f_1 = interp1d(range(df.shape[1]), df.loc[0], fill_value='extrapolate')
with the extrapolated value f_1(df.shape[1]).item() = 4.0.
So the column I would like to add would be:
col4 |
---|
4 |
8 |
I've tried something like the following:
import numpy as np
def interp_row(row):
    n = row.shape[1]
    fun = interp1d(np.arange(n), row, fill_value='extrapolate')
    return fun(n+1).item()
df['col4'] = df.apply(lambda row: interp_row(row))
Can I make this work?
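You were almost there. Three things needed changing: each row passed to apply is a one-dimensional Series, so its length is row.shape[0], not row.shape[1]; with the known points at positions 0 to n-1, the next point is at n, not n+1; and apply needs axis=1 to work row by row: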
import pandas as pd
from scipy.interpolate import interp1d
import numpy as np
d = {'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]}
df = pd.DataFrame(data=d)
def interp_row(row):
    n = row.shape[0]
    fun = interp1d(np.arange(n), row, fill_value='extrapolate')
    return fun(n).item()
df['col4'] = df.apply(lambda row: interp_row(row), axis=1)
print(df)
which returns:
   col1  col2  col3  col4
0     1     2     3   4.0
1     2     4     6   8.0
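If you only ever need one step of linear extrapolation, you may not need a per-row function at all. interp1d defaults to linear interpolation, so extrapolating past the end just continues the line through the last two points. The following is a minimal sketch under that assumption (evenly spaced column positions 0, 1, 2, ... and kind='linear'); it is a vectorized alternative, not a drop-in replacement for other interpolation kinds:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]})

# Assumption: linear extrapolation one step past the last column,
# with columns sitting at evenly spaced positions 0, 1, 2, ...
values = df.to_numpy(dtype=float)        # shape (n_rows, n_cols)
slope = values[:, -1] - values[:, -2]    # per-row slope of the last segment
df['col4'] = values[:, -1] + slope       # value at position n_cols

print(df)
```

For this example it gives the same col4 values (4.0 and 8.0) as the apply version, and it avoids building one interpolator per row, which should matter on large frames.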