Good morning all,
I have a pandas DataFrame containing multiple Series. A given Series within the DataFrame holds a mix of types: unicode strings, NaN, and int/float. I want to count the NaNs in the Series, but I cannot use the built-in numpy.isnan function because it cannot safely cast unicode data into a format it can interpret. I have a workaround (shown below), but I'm wondering if there is a better/more Pythonic way of accomplishing this task.
Thanks in advance, Myles
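import pandas as pd
import numpy as np

test = pd.Series(data=[np.nan, 2, u'string'])
np.isnan(test).sum()
# Error: TypeError, np.isnan cannot handle the object-dtype Series

# Workaround: drop the strings, then count the NaNs in what is left
# (unicode is the Python 2 text type; on Python 3 use str instead)
test2 = [x for x in test if not isinstance(x, unicode)]
numNaNs = np.isnan(test2).sum()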
Use pandas.isnull:
In [24]: test = pd.Series(data=[np.nan, 2, u'string'])
In [25]: pd.isnull(test)
Out[25]:
0     True
1    False
2    False
dtype: bool
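Since the question asks for a count rather than a mask, the boolean result can simply be summed. A minimal sketch, assuming the same three-element Series as above (the name num_nans is only illustrative):

```python
import numpy as np
import pandas as pd

# Mixed-type Series: a NaN, an integer, and a string (object dtype).
test = pd.Series([np.nan, 2, u'string'])

# pd.isnull returns a boolean mask; True counts as 1 when summed.
num_nans = int(pd.isnull(test).sum())
print(num_nans)
```

The method form, test.isnull().sum(), should give the same count.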
Note, however, that pd.isnull also treats None as missing and returns True for it:
In [28]: pd.isnull([np.nan, 2, u'string', None])
Out[28]: array([ True, False, False, True], dtype=bool)
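If None must not be counted, one option is to test each element explicitly for a float NaN. This is only a sketch and not part of pandas: is_float_nan is a hypothetical helper, and the sample list is assumed for illustration.

```python
import numpy as np
import pandas as pd

# Sample data assumed for illustration: one NaN, one None.
values = [np.nan, 2, u'string', None]

def is_float_nan(x):
    """Hypothetical helper: True only for float NaN, never for None or strings."""
    return isinstance(x, float) and np.isnan(x)

# Counts float NaN only.
only_nan = sum(is_float_nan(x) for x in values)

# Counts NaN and None together, as pd.isnull does.
nan_or_none = int(pd.isnull(values).sum())

print(only_nan, nan_or_none)
```

The isinstance check runs first, so np.isnan is never called on a string or None and the TypeError from the question cannot occur.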