Each row in this dataframe represents an order and executionStatus.x has some info about the order status.
Those executionStatus.x columns are automatically created by flatten_json by amirziai, depending on how many arguments there are. So if there are 3 statuses for one order like in row 0, there will be up to executionStatus.2. Since row 1 and 2 only have one status, it only has values in executionStatus.0.
My problem is I cannot match "ORDER_FULFILLED" because I don't know how many executionStatuses there will be and I would need to write the exact column name like so df[df['executionStatus.0'].str.match('ORDER_FULFILLED')].
executionStatus.0 executionStatus.1 executionStatus.2 \
0 REQUESTED_AMOUNT_ROUNDED MEOW ORDER_FULFILLED
1 ORDER_FULFILLED NaN NaN
2 NOT_AN_INFUNDING_LOAN NaN NaN
investedAmount loanId requestedAmount OrderInstructId
0 50.0 22222 55.0 55555
1 25.0 33333 25.0 55555
2 0.0 44444 25.0 55555
Is there a way to get the entire row or index that matched with "ORDER_FULFILLED" element in the entire dataframe?
Ideally, the matched dataframe should look like this because row 0 and row 1 have ORDER_FULFILLED in the executionStatuses and row 3 does not so it should be excluded. Thanks!
investedAmount loanId requestedAmount OrderInstructId
0 50.0 22222 55.0 55555
1 25.0 33333 25.0 55555
Use df.filter()
for getting the similar columns containing executionStatus
with a boolean mask:
df[df.filter(like='executionStatus').eq('ORDER_FULFILLED').any(axis=1)]
executionStatus.0 executionStatus.1 executionStatus.2 \
0 REQUESTED_AMOUNT_ROUNDED MEOW ORDER_FULFILLED
1 ORDER_FULFILLED NaN NaN
investedAmount loanId requestedAmount OrderInstructId
0 50 22222 55 55555
1 25 33333 25 55555
If you want to delete he execution
columns from output, use:
df.loc[df.filter(like='executionStatus').eq('ORDER_FULFILLED').any(axis=1),\
df.columns.difference(df.filter(like='executionStatus').columns)