I'm trying to do the same thing as in this question, but I have a string-type column that I need to keep in the dataframe so I can identify which rows are which. (I guess I could do this by index, but I'd like to be able to save a step.) Is there a way to not count a column when using .any(), but keep it in the resulting dataframe? Thanks!
Here's the code that works on all columns:
df[(df > threshold).any(axis=1)]
Here's the hard coded version I'm working with right now:
df[(df[list_of_selected_columns] > 3).any(axis=1)]
This seems a little clumsy to me, so I'm wondering if there's a better way.
You can use .select_dtypes to choose all, say, numerical columns:
df[df.select_dtypes(include='number').gt(threshold).any(axis=1)]
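For illustration, here is a minimal sketch of the dtype-based filter on a made-up frame. The column names (`name`, `x`, `y`), the data, and the threshold value are assumptions for the example, not taken from the question. The mask is built from the numeric columns only, but it indexes the full frame, so the string column stays in the result.

```python
import pandas as pd

# Hypothetical data: one string identifier column plus two numeric columns.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": [1, 5, 2],
    "y": [0, 1, 2],
})
threshold = 3  # assumed value for the example

# Build the boolean mask from numeric columns only...
mask = df.select_dtypes(include="number").gt(threshold).any(axis=1)

# ...then apply it to the full frame, so "name" is kept.
result = df[mask]
print(result)
```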
Or a chunk of contiguous columns with iloc:
df[df.iloc[:,3:6].gt(threshold).any(axis=1)]
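As a rough sketch of the positional version, note that `iloc[:, 3:6]` picks the columns at positions 3, 4 and 5 (the end of the slice is exclusive). The frame below is invented for the example; its column names and values are assumptions.

```python
import pandas as pd

# Hypothetical frame: "label" is position 0, "a".."f" are positions 1..6.
df = pd.DataFrame({
    "label": ["r0", "r1", "r2"],
    "a": [9, 0, 0],
    "b": [0, 0, 0],
    "c": [0, 0, 0],
    "d": [0, 7, 0],
    "e": [0, 0, 0],
    "f": [0, 0, 9],
})
threshold = 3  # assumed value for the example

# Only columns at positions 3, 4, 5 ("c", "d", "e") are checked.
checked = df.iloc[:, 3:6]
mask = checked.gt(threshold).any(axis=1)

# Large values in "a" and "f" are ignored because they fall outside the slice.
result = df[mask]
print(result)
```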
If you want to select an arbitrary list of columns, your best option is a hard-coded list.
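If it is easier to name the columns to skip than the columns to check, one possible variation (not part of the approaches above) is to drop the excluded column only while building the mask. This sketch assumes a single identifier column called `name` and that every remaining column is numeric; both are assumptions for the example.

```python
import pandas as pd

# Hypothetical data: "name" identifies rows and should not be compared.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": [1, 5, 2],
    "y": [0, 1, 4],
})
threshold = 3  # assumed value for the example

# Drop the identifier column for the comparison only; df itself is unchanged.
mask = df.drop(columns=["name"]).gt(threshold).any(axis=1)

# Index the original frame, so "name" is still present in the result.
result = df[mask]
print(result)
```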