I have a pandas dataframe with floats and strings and few <NA>
and nan
. I am trying to locate all the <NA>
and convert them into nan
using the following function pd.to_numeric(....., errors='coerce')
. Also making sure that floats and strings remain untouched. Could you please help me with it?
Also when I df.to_dict('list')
I get few ''
and <NA>
.
Thank you.
Whether you have string <NA>
or pandas NA (pd.NA
), both should be converted to nan
using pd.to_numeric
:
df = pd.DataFrame({'col': [1, '1', 'a', float('nan'), pd.NA, '<NA>']})
pd.to_numeric(df['col'], errors='coerce')
Output:
0 1.0
1 1.0
2 NaN
3 NaN
4 NaN
5 NaN
Name: col, dtype: float64
If you want to replace specific items, keeping the strings, rather use:
df['col'].replace({'<NA>': float('nan'), pd.NA: float('nan')})
Output:
0 1
1 1
2 a
3 NaN
4 NaN
5 NaN
Name: col, dtype: object
If you need string representations of number as numbers, and other strings intact while removing <NA>
/pd.NA
:
out = (pd.to_numeric(df['col'], errors='coerce')
.fillna(df['col'].replace({'<NA>': float('nan'), pd.NA: float('nan')}))
)
Output:
0 1.0
1 1.0
2 a
3 NaN
4 NaN
5 NaN
Name: col, dtype: object