Replace dataframe values that are an array of NaN with 0

I have a dataframe where, in one column, some values are not a plain NaN but an array of NaNs (i.e., "[nan, nan, nan]").

I want to change those values to 0. If it were simply "nan" I would use:

df.fillna(0)

But that doesn't work in this instance.

For instance, given:

df1 = pd.DataFrame({
         'ID':[1,2,3,4,5,6],
         'Version':[1,1,2,2,1,2],
         'Cost':[17,np.nan,24,[np.nan, np.nan, np.nan],13,8]})

Using df1.fillna(0) yields:

   ID  Version             Cost
0   1        1               17
1   2        1                0
2   3        2               24
3   4        2  [nan, nan, nan]
4   5        1               13
5   6        2                8

whereas I'd like to get this output:

   ID  Version             Cost
0   1        1               17
1   2        1                0
2   3        2               24
3   4        2                0
4   5        1               13
5   6        2                8

Solution

In your case the Cost column has object dtype, so you can first convert it to numeric and then call fillna.

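import numpy as np
import pandas as pd

# Same data as in the question: one scalar NaN and one list of NaNs in Cost
df = pd.DataFrame({"ID": list(range(1, 7)),
                   "Version": [1, 1, 2, 2, 1, 2],
                   "Cost": [17, np.nan, 24, [np.nan, np.nan, np.nan], 13, 8]})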

Here df.dtypes gives:

ID          int64
Version     int64
Cost       object
dtype: object
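
A quick way to see why fillna skips row 3 is to check what isna reports for the column. The following is a minimal sketch that rebuilds only the Cost column from the question (the names cost and missing_mask are illustrative): the list is an ordinary Python object stored in a cell, so pandas does not treat it as missing.

```python
import numpy as np
import pandas as pd

# Rebuild the Cost column from the question; dtype=object keeps the list as a single cell.
cost = pd.Series([17, np.nan, 24, [np.nan, np.nan, np.nan], 13, 8], dtype=object)

# isna() flags only the scalar NaN; the list cell counts as a regular, non-missing object.
missing_mask = cost.isna()
print(missing_mask.tolist())
```

Only the position holding the scalar NaN is flagged, which is why fillna(0) leaves the list untouched.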

So you can convert this column with pd.to_numeric using errors='coerce', which assigns np.nan wherever conversion is not possible.

df["Cost"] = pd.to_numeric(df["Cost"], errors='coerce').fillna(0)

Or, if you prefer, in two steps:

df["Cost"] = pd.to_numeric(df["Cost"], errors='coerce')
df["Cost"] = df["Cost"].fillna(0)
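
Keep in mind that errors='coerce' turns every non-numeric value into NaN, not only the NaN arrays, and the resulting column is float. If the column might hold other values you want to keep, one alternative is to replace only the cells that are NaN or all-NaN sequences. A minimal sketch, assuming a hypothetical helper nan_array_to_zero (it is not part of pandas):

```python
import numpy as np
import pandas as pd

def nan_array_to_zero(value):
    """Hypothetical helper: map a scalar NaN or an all-NaN list/tuple/array to 0."""
    if isinstance(value, (list, tuple, np.ndarray)):
        try:
            arr = np.asarray(value, dtype=float)
        except (TypeError, ValueError):
            return value  # not purely numeric, leave the cell untouched
        return 0 if arr.size > 0 and np.isnan(arr).all() else value
    # Scalars: replace NaN/None with 0, keep everything else as-is
    return 0 if pd.isna(value) else value

# Same frame as in the question
df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'Version': [1, 1, 2, 2, 1, 2],
    'Cost': [17, np.nan, 24, [np.nan, np.nan, np.nan], 13, 8]})

df1['Cost'] = df1['Cost'].apply(nan_array_to_zero)
cost_values = df1['Cost'].tolist()
print(df1)
```

This is slower than pd.to_numeric on large frames because apply calls the helper once per cell, but it leaves any other non-numeric entries in place instead of silently zeroing them.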