I have a dataframe where, in one column, I've ended up with some values that are not merely "NaN" but an array of NaNs (i.e., "[nan, nan, nan]").
I want to change those values to 0. If it were simply "nan" I would use:
df.fillna(0)
But that doesn't work in this instance.
For instance, given:
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'Version': [1, 1, 2, 2, 1, 2],
    'Cost': [17, np.nan, 24, [np.nan, np.nan, np.nan], 13, 8]})
Using df1.fillna(0) yields:
   ID  Version             Cost
0   1        1               17
1   2        1                0
2   3        2               24
3   4        2  [nan, nan, nan]
4   5        1               13
5   6        2                8
When I'd like to get the output:
   ID  Version  Cost
0   1        1    17
1   2        1     0
2   3        2    24
3   4        2     0
4   5        1    13
5   6        2     8
In your case, the column Cost has dtype object, so you can first convert it to numeric and then call fillna.
import pandas as pd
df = pd.DataFrame({"ID":list(range(1,7)),
"Version":[1,1,2,2,1,2],
"Cost": [17,0,24,['nan', 'nan', 'nan'], 13, 8]})
where df.dtypes gives:
ID          int64
Version     int64
Cost       object
dtype: object
So you can convert this column with pd.to_numeric using errors='coerce', which assigns np.nan wherever conversion is not possible:
df["Cost"] = pd.to_numeric(df["Cost"], errors='coerce')\
.fillna(0)
or, if you prefer, in two steps:
df["Cost"] = pd.to_numeric(df["Cost"], errors='coerce')
df["Cost"] = df["Cost"].fillna(0)
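One thing to keep in mind: as far as I can tell, errors='coerce' turns every value it cannot parse into NaN, so a list holding real numbers would end up as 0 too. If that matters, a more selective approach is to test each cell yourself. Below is a minimal sketch on the frame from the question; nan_list_to_zero is a hypothetical helper name of my own, not a pandas function, and it assumes the bad cells are lists, tuples or NumPy arrays.

```python
import numpy as np
import pandas as pd


def nan_list_to_zero(value):
    """Return 0 for a scalar NaN or an all-NaN list/tuple/array.

    Any other value is returned unchanged.
    Hypothetical helper for illustration; not part of pandas.
    """
    if isinstance(value, (list, tuple, np.ndarray)):
        # pd.isna checks an array element by element
        is_missing = pd.isna(np.asarray(value, dtype=object))
        return 0 if is_missing.size > 0 and is_missing.all() else value
    # Scalars: replace NaN/None with 0, keep everything else
    return 0 if pd.isna(value) else value


df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'Version': [1, 1, 2, 2, 1, 2],
    'Cost': [17, np.nan, 24, [np.nan, np.nan, np.nan], 13, 8]})

# Apply the check cell by cell on the Cost column only
df1['Cost'] = df1['Cost'].apply(nan_list_to_zero)
print(df1)
```

On this data the Cost column becomes 17, 0, 24, 0, 13, 8, while a cell such as [1, nan] would be left untouched. It is slower than to_numeric on large frames because apply calls the function once per cell.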