i have the following dataframe
import pandas as pd
d = {'Cell':['A','B','C'],'D1':[5, 2, 3], 'D2':['NaN', 5, 6], 'D3':[7,'NaN', 9], 'D6':[17, 3, 'NaN']}
df = pd.DataFrame(d)
print(df)
Cell D1 D2 D3 D6
0 A 5 NaN 7 17
1 B 2 5 NaN 3
2 C 3 6 9 NaN
i want to check whether D6>D3>D2>D1 by ignoring the NAN values and create a new column 'is_monotonic_increasing' with true or false
d1 = {'Cell':['A','B','C'],'D1':[5, 2, 3], 'D2':['NaN', 5, 6], 'D3':[7,'NaN', 9], 'D6':[17, 3, 'NaN'],'is_monotonic_increasing?':['True','False','True']}
df1 = pd.DataFrame(d1)
print(df1)
Cell D1 D2 D3 D6 is_monotonic_increasing?
0 A 5 NaN 7 17 True
1 B 2 5 NaN 3 False
2 C 3 6 9 NaN True
column names that i want to check are dynamic and they can be fetched as list in the program column_names=[D1,D2,D3,D6]
i tried this
df['is_monotonic_increasing?']=df.apply(lambda x:x.is_monotonic_increasing,axis=1)
but it is always producing false as output due to NaN.
Cell D1 D2 D3 D6 is_monotonic_increasing?
0 A 5 NaN 7 17 False
1 B 2 5 NaN 3 False
2 C 3 6 9 NaN False
Please let me know any solution to this?
You can fill forward values to compute the diff along the column axis:
df['inc'] = (df.filter(regex='^D\d+').ffill(axis=1).bfill(axis=1)
.diff(axis=1).iloc[:, 1:].ge(0).all(axis=1))
print(df)
# Output
Cell D1 D2 D3 D6 inc
0 A 5.0 NaN 7.0 17.0 True
1 B 2.0 5.0 NaN 3.0 False
2 C 3.0 6.0 9.0 NaN True