I'm facing a problem I don't understand:
>>> Y.isnull().values.any()
False
>>> Y.where(Y == 0).isnull().values.any()
True
I don't understand how NaN values can appear in the second result. Y has dtype int64.
Any ideas? Thank you very much!
Check out the behavior of pandas.DataFrame.where: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.where.html
It keeps the values that meet the condition and replaces the ones that don't with something else. It does not return booleans. By default the replacement (the other parameter) is NaN. That means Y.where(Y == 0) puts NaN in place of every value that wasn't 0, hence the NaN values in your second result. Since int64 cannot hold NaN, the result is typically upcast to float64.
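Here is a minimal sketch of the behavior described above. The small DataFrame is a made-up stand-in for Y (the original data isn't shown), and -1 is just an arbitrary replacement value chosen for illustration:

```python
import pandas as pd

# Hypothetical stand-in for Y: an int64 DataFrame with no missing values.
Y = pd.DataFrame({"a": [0, 1, 2], "b": [3, 0, 5]})

# Default behavior: cells where the condition is False become NaN.
masked = Y.where(Y == 0)
has_nan = masked.isnull().values.any()

# Passing `other` supplies a replacement, so no NaN is introduced.
filled = Y.where(Y == 0, other=-1)

# If a boolean mask is what you actually want, compare directly.
is_zero = Y == 0

print(masked)
print(masked.dtypes)
print(filled)
print(is_zero)
```

If the goal is only to check whether Y contains zeros, (Y == 0).values.any() avoids where altogether.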