Apply a function to a dataframe in a rolling fashion, using values from two columns and the previous row

Let's say I have the following dataframe (in reality it is much bigger, so the method needs to be fast):

df = pd.DataFrame({"distance1": [101, 102, 103], "distance2":[12, 33, 44]})

   distance1  distance2
0        101         12
1        102         33
2        103         44

Now I want to apply the following function to this dataframe:

# Intended logic (pseudocode): n is the current row, n-1 is the previous row
def distance(x):
    return np.sqrt(np.power(x.loc[n, "distance1"] - x.loc[n-1, "distance1"], 2) + np.power(x.loc[n, "distance2"] - x.loc[n-1, "distance2"], 2))

df["dist"] = df.apply(distance, axis=1)

Essentially, I want to calculate the Euclidean distance between consecutive rows, treating (distance1, distance2) as a point, where n is the current row and n-1 is the previous row in the dataframe.

Solution

You can do this without a row-wise apply by using shift(), which is vectorized and should stay fast on a large dataframe:

import numpy as np
import pandas as pd

# Example data
df = pd.DataFrame({"distance1": [101, 102, 103], "distance2":[12, 33, 44]})
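# shift(1) moves each column down by one row, so row n is compared with row n-1
df['dist'] = np.sqrt((df['distance1'] - df['distance1'].shift(1))**2 + (df['distance2'] - df['distance2'].shift(1))**2)

# The first row has no previous row, so shift(1) already leaves NaN there;
# this assignment only makes that explicit
df.loc[0, 'dist'] = np.nan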

print(df)

which would give you:

   distance1  distance2       dist
0        101         12        NaN
1        102         33  21.023796
2        103         44  11.045361
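
As a possible variation on the same idea (a sketch, not a replacement for the approach above), the row-to-row differences can be taken with DataFrame.diff() and combined with np.hypot, which computes sqrt(a**2 + b**2) element-wise. The name deltas is introduced here only for illustration:

```python
import numpy as np
import pandas as pd

# Same example data as above
df = pd.DataFrame({"distance1": [101, 102, 103], "distance2": [12, 33, 44]})

# diff() subtracts the previous row from each row; the first row becomes NaN
deltas = df[["distance1", "distance2"]].diff()

# hypot(a, b) is sqrt(a**2 + b**2), applied element-wise to the two columns
df["dist"] = np.hypot(deltas["distance1"], deltas["distance2"])

print(df)
```

For this example data it should produce the same dist column as the shift() version: NaN in the first row, then sqrt(1**2 + 21**2) and sqrt(1**2 + 11**2).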