Let's say I have the following dataframe (in reality it is much bigger, so the method should be fast):
df = pd.DataFrame({"distance1": [101, 102, 103], "distance2":[12, 33, 44]})
   distance1  distance2
0        101         12
1        102         33
2        103         44
Now I want to apply the following function to this dataframe:
# Pseudocode: n stands for the current row index, n-1 for the previous row
def distance(x):
    return np.sqrt(np.power(x.loc[n, "distance1"] - x.loc[n-1, "distance1"], 2) + np.power(x.loc[n, "distance2"] - x.loc[n-1, "distance2"], 2))
df["dist"] = df.apply(distance, axis=1)
Essentially, I want to calculate the Euclidean distance between each row and the one before it, treating (distance1, distance2) as a point, where n is the current row and n-1 is the previous row in the dataframe.
You could do this without apply, using shift to line each row up with the previous one:
import numpy as np
import pandas as pd
# Example data
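df = pd.DataFrame({"distance1": [101, 102, 103], "distance2": [12, 33, 44]})
# shift(1) moves each column down one row, so row n is paired with row n-1
df['dist'] = np.sqrt((df['distance1'] - df['distance1'].shift(1))**2 + (df['distance2'] - df['distance2'].shift(1))**2)
# The first row has no previous row; shift already yields NaN there, this just makes it explicit
df.loc[0, 'dist'] = np.nan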
print(df)
which would give you:
   distance1  distance2       dist
0        101         12        NaN
1        102         33  21.023796
2        103         44  11.045361
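A slightly more compact variant of the same idea, as a sketch rather than the only way to do it: diff() computes the row-to-row differences directly, and np.hypot takes the square root of the sum of squares element-wise. The intermediate name deltas is only for illustration; the data is the same example as above.

```python
import numpy as np
import pandas as pd

# Same example data as above
df = pd.DataFrame({"distance1": [101, 102, 103], "distance2": [12, 33, 44]})

# diff() subtracts the previous row from each row; the first row becomes NaN
deltas = df[["distance1", "distance2"]].diff()

# np.hypot(a, b) computes sqrt(a**2 + b**2) element-wise
df["dist"] = np.hypot(deltas["distance1"], deltas["distance2"])
print(df)
```

Both versions operate on whole columns at once instead of calling a Python function per row, which is generally what you want on a large dataframe.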