I have two rows of data in a pandas DataFrame and want to apply a function to each column separately, where the function uses both values in that column, e.g.
import pandas as pd
df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})
   x  z   i   j   y
0  1  2   3   4   5
1  2  6  12  20  30
The function is something like the row 2 value minus the row 1 value, divided by the row 2 value, for each column separately, e.g.
(row2-row1)/row2
so I can get the following
0.5 0.667 0.75 0.8 0.833
Based on the following links
how to apply a user defined function column wise on grouped data in pandas
https://pythoninoffice.com/pandas-how-to-calculate-difference-between-rows
Groupby and apply a defined function - Pandas
I tried the following
df.apply(lambda x,y: (x + y)/y, axis=0)
This does not work: apply passes each column to the function as a single argument, so the lambda raises a TypeError because y is missing.
df.diff()
This runs, but it only gives the difference between rows (with NaN in the first row), which is not exactly the function I want.
Does anyone know how to achieve the result I expect?
After testing many things, I found that the lambda does not need two arguments (x, y). It takes a single argument, which is a vector holding all the values in the column, so the following solved the issue:
df.apply(lambda x: (x[1] - x[0]) / x[1], axis=0)
This also avoids the NaN that df.diff() produces in the first row.
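For completeness, here is a minimal, self-contained sketch of the same calculation. It uses `.iloc` for positional access, which should behave the same as the integer indexing above on the default index but does not depend on the index labels being 0 and 1. It also shows a vectorized form that works on the two rows directly without `apply`. The names `via_apply` and `vectorized` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

# Column-wise apply: each column arrives as one Series; .iloc selects by position.
via_apply = df.apply(lambda col: (col.iloc[1] - col.iloc[0]) / col.iloc[1], axis=0)

# Vectorized alternative: subtract and divide the two rows directly.
vectorized = (df.iloc[1] - df.iloc[0]) / df.iloc[1]

print(vectorized.round(3))
```

Both results are a Series indexed by column name. The vectorized form is likely the simpler choice when there are exactly two rows, while `apply` is more flexible if the per-column function becomes more involved.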