python-3.x pandas dataframe calculated-columns

How to calculate difference between rows in Pandas DataFrame?


In a DataFrame I have four variables: the X, Y, Z and W components of a robot's orientation. Each row is one measurement of these four values.

import pandas as pd

x = [-0.75853, -0.75853, -0.75853, -0.75852]
y = [-0.63435, -0.63434, -0.63435, -0.63436]
z = [-0.10488, -0.10490, -0.10492, -0.10495]
w = [-0.10597, -0.10597, -0.10597, -0.10596]

# Build from a dict so each list becomes a column; passing [x, y, z, w] would make each list a row
df = pd.DataFrame({'x': x, 'y': y, 'z': z, 'w': w})

I wrote the function below, which returns three distances between two quaternions:

from pyquaternion import Quaternion

def quaternion_distances(w1, x1, y1, z1, w2, x2, y2, z2):
    """ Create two Quaternion objects and calculate 3 distances between them """
    q1 = Quaternion(w1, x1, y1, z1)
    q2 = Quaternion(w2, x2, y2, z2)

    dist_by_signal  = Quaternion.absolute_distance(q1, q2)
    dist_geodesic   = Quaternion.distance(q1, q2)
    dist_sim_geodec = Quaternion.sym_distance(q1, q2)

    return dist_by_signal, dist_geodesic, dist_sim_geodec

Each distance is calculated between the values of one row and the values of the row before it, so the function needs two rows at a time. Because of that, I don't see how to use the Pandas apply function, which passes one row at a time.

I have already added three columns to the dataframe, so that I receive each of the values returned by the function:

df['dist_by_signal']  = 0
df['dist_geodesic']   = 0
df['dist_sim_geodec'] = 0

The problem is: how do I apply the function above to each pair of consecutive rows and store the results in these new columns? Can you give me a suggestion?


Solution

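  • Use shift(-1) to copy the next row's values into adjacent columns w2, x2, y2, z2, then run a row-wise apply. Row-wise apply requires axis='columns'; the default, axis='index', would pass each column to the function instead of each row. The last row has no following row, so its shifted values are NaN: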

    # Select the source columns in the same w, x, y, z order as the new names
    df[[col + '2' for col in 'wxyz']] = df[list('wxyz')].shift(-1)
    
    def quaternion_distances(row):
        """ Create two Quaternion objects and calculate 3 distances between them """
        q1 = Quaternion(row['w'], row['x'], row['y'], row['z'])
        q2 = Quaternion(row['w2'], row['x2'], row['y2'], row['z2'])
    
        row['dist_by_signal']  = Quaternion.absolute_distance(q1, q2)
        row['dist_geodesic']   = Quaternion.distance(q1, q2)
        row['dist_sim_geodec'] = Quaternion.sym_distance(q1, q2)
    
        return row
    
    
    df = df.apply(quaternion_distances, axis='columns')
    
    print(df)
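
If only plain array arithmetic is needed, the same pairing of adjacent rows can be done without apply. The sketch below is an alternative under stated assumptions, not part of the answer above: it uses NumPy instead of pyquaternion and computes only the first distance, assuming the absolute distance is min(‖q1 − q2‖, ‖q1 + q2‖), which is my understanding of what Quaternion.absolute_distance returns. The names cols, q1, q2, d_minus and d_plus are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question: one quaternion per row.
df = pd.DataFrame({
    "x": [-0.75853, -0.75853, -0.75853, -0.75852],
    "y": [-0.63435, -0.63434, -0.63435, -0.63436],
    "z": [-0.10488, -0.10490, -0.10492, -0.10495],
    "w": [-0.10597, -0.10597, -0.10597, -0.10596],
})

cols = ["w", "x", "y", "z"]
q1 = df[cols].to_numpy()            # current row
q2 = df[cols].shift(-1).to_numpy()  # next row; the last row becomes NaN

# Assumed formula: min(||q1 - q2||, ||q1 + q2||), so that q and -q
# are treated as the same orientation.
d_minus = np.linalg.norm(q1 - q2, axis=1)
d_plus = np.linalg.norm(q1 + q2, axis=1)
df["dist_by_signal"] = np.minimum(d_minus, d_plus)

print(df)
```

The last row has no successor, so its distance stays NaN. To compare each row with the previous one instead, shift(1) could be used in place of shift(-1), which moves the NaN to the first row.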