Tags: python, pandas, dataframe, is-empty

Empty DataFrame doesn't admit it's empty


I must be misunderstanding something about emptiness when it comes to pandas DataFrames. I have a DataFrame with empty rows, but when I isolate one of these rows, it's not empty.

Here I've made a dataframe:

>>> import pandas
>>> df = pandas.DataFrame(columns=[1,2,3], data=[[1,2,3],[1,None,3],[None, None, None],[3,2,1],[4,5,6],[None,None,None],[None,None,None]])
>>> df
     1    2    3
0  1.0  2.0  3.0
1  1.0  NaN  3.0
2  NaN  NaN  NaN
3  3.0  2.0  1.0
4  4.0  5.0  6.0
5  NaN  NaN  NaN
6  NaN  NaN  NaN

I know row 2 is full of nothing, so I check for that:

>>> df[2:3].empty
False

Odd. So I split it out into its own dataframe:

>>> df1 = df[2:3]
>>> df1
    1   2   3
2 NaN NaN NaN

>>> df1.empty
False

How do I check for emptiness (all the elements in a row being None or NaN)?

http://pandas.pydata.org/pandas-docs/version/0.18/generated/pandas.DataFrame.empty.html


Solution

  • You're misunderstanding what empty is for. It checks whether a Series/DataFrame has no elements at all, i.e. whether any of its axes has length 0. A row full of NaN is still a row, so a frame containing it is not empty. For example,

    df.iloc[1:0]
    
    Empty DataFrame
    Columns: [1, 2, 3]
    Index: []
    
    df.iloc[1:0].empty
    True
    

    If you want to check that every value in a row is NaN, use isnull followed by all across the columns (axis=1):

    df.isnull().all(axis=1)
    
    0    False
    1    False
    2     True
    3    False
    4    False
    5     True
    6     True
    dtype: bool
    

    For your example, this should do:

    df[2:3].isnull().all(axis=1).item()
    True
    

    Note that item only works when the slice is exactly one row; on a slice with more rows (or none) it raises a ValueError.
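
If the slice can be longer than one row, something along these lines might help. It is only a sketch: `is_all_nan` is a made-up helper name for illustration (not a pandas function), and it treats a zero-row slice as "all NaN" because there is nothing in it to contradict that, which may or may not be what you want.

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame(
    columns=[1, 2, 3],
    data=[
        [1, 2, 3],
        [1, None, 3],
        [None, None, None],
        [3, 2, 1],
        [4, 5, 6],
        [None, None, None],
        [None, None, None],
    ],
)

# `empty` only looks at the shape: a 1x3 slice has elements, NaN or not.
one_row = df[2:3]
zero_rows = df.iloc[1:0]

# Per-row flag: True where every value in the row is NaN/None.
all_nan_rows = df.isnull().all(axis=1)


def is_all_nan(frame):
    """Hypothetical helper: True if every cell in `frame` is NaN/None.

    Works for any number of rows, unlike `.item()`.
    """
    return bool(frame.isnull().values.all())


# Drop the rows that are entirely NaN and keep the rest.
cleaned = df.dropna(how="all")

print(one_row.empty, zero_rows.empty)
print(is_all_nan(df[5:7]), is_all_nan(df[1:3]))
print(cleaned.index.tolist())
```

`dropna(how="all")` is usually the simplest route if the end goal is just to get rid of the blank rows rather than to test for them.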