python-3.x pandas dataframe missing-data drop

How to drop an entire record if more than 90% of its features have missing values in pandas


I have a pandas dataframe called df with 500 columns and 2 million records.

I am able to drop columns that contain more than 90% missing values.

But how can I drop an entire record in pandas if 90% or more of its columns have missing values?

I have seen a similar post for R, but I am coding in Python at the moment.


Solution

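  • You can use df.dropna() with the thresh parameter, which is the minimum number of non-NA values a row must have to be kept. With 500 columns, 10% is 50, so thresh=50 drops every row that has fewer than 50 non-NA values, i.e. more than 90% missing. To also drop rows with exactly 90% missing, use thresh=51.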

    df.dropna(axis=0, thresh=50, inplace=True)
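If the number of columns may change, the threshold can be derived from the frame instead of hard-coded. The following is only a minimal sketch: the helper name drop_sparse_rows, its max_missing_pct and inclusive parameters, and the 10-column demo frame are assumptions made for illustration, not part of the original answer. It uses integer arithmetic so the cutoff is not affected by floating-point rounding, and it returns a new frame rather than modifying df in place.

```python
import numpy as np
import pandas as pd


def drop_sparse_rows(df, max_missing_pct=90, inclusive=False):
    """Return a copy of df without rows that are mostly missing.

    inclusive=False drops rows with MORE than max_missing_pct percent missing.
    inclusive=True also drops rows with EXACTLY max_missing_pct percent missing.
    """
    n_cols = df.shape[1]
    if inclusive:
        # Smallest non-NA count strictly above the allowed remainder.
        min_non_na = n_cols * (100 - max_missing_pct) // 100 + 1
    else:
        # Allow at most floor(n_cols * pct / 100) missing values per row.
        min_non_na = n_cols - n_cols * max_missing_pct // 100
    return df.dropna(axis=0, thresh=min_non_na)


# Tiny stand-in for the 500-column frame: 10 columns, 3 rows, all NaN to start.
demo = pd.DataFrame(np.nan, index=range(3), columns=[f"c{i}" for i in range(10)])
demo.loc[1, "c0"] = 1.0  # row 1: 9 of 10 values missing (exactly 90%)
demo.loc[2, :] = 1.0     # row 2: nothing missing

kept_strict = drop_sparse_rows(demo)                     # drops row 0 only
kept_inclusive = drop_sparse_rows(demo, inclusive=True)  # drops rows 0 and 1
```

For the 500-column frame in the question, the default settings work out to thresh=50 and inclusive=True works out to thresh=51.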