python pandas isnullorempty

How to identify non-empty columns in a pandas DataFrame?


I'm processing a dataset with around 2000 columns and I noticed many of them are empty. I want to know specifically how many of them are empty and how many are not. I use the following code:

df.isnull().sum()

This gives me the number of null values in each column. However, since I'm investigating about 2000 columns with 7593 rows, the output in IPython looks like the following:

FEMALE_DEATH_YR3_RT              7593
FEMALE_COMP_ORIG_YR3_RT          7593
PELL_COMP_4YR_TRANS_YR3_RT       7593
PELL_COMP_2YR_TRANS_YR3_RT       7593
                             ... 
FIRSTGEN_YR4_N                   7593
NOT1STGEN_YR4_N                  7593

IPython truncates the output because there are too many columns, which makes it very difficult to tell how many columns are entirely empty and how many are not. Is there any way to identify the non-empty columns quickly? Thanks!


Solution

  • to find the number of empty (all-NaN) columns, subtract the number of columns that survive dropna from the total:
    

    len(df.columns) - len(df.dropna(axis=1,how='all').columns)

    3
    

    df

      Country   movie name  rating  year  Something
    0     thg    John    3     NaN   NaN        NaN
    1     thg     Jan    4     NaN   NaN        NaN
    2     mol  Graham  lob     NaN   NaN        NaN
    
  • to drop the all-empty columns and keep only the non-empty ones:

    df = df.dropna(axis=1, how='all')
    
      Country   movie name
    0     thg    John    3
    1     thg     Jan    4
    2     mol  Graham  lob
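
If you want both counts and the names of the non-empty columns without dropping anything, one possible approach is to build a boolean mask with isnull().all(). This is only a minimal sketch: the sample frame is a hypothetical reconstruction of the one shown above (not the asker's real data), and the variable names all_null, n_empty, n_non_empty and non_empty_cols are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame mirroring the small example above
df = pd.DataFrame({
    "Country": ["thg", "thg", "mol"],
    "movie": ["John", "Jan", "Graham"],
    "name": [3, 4, "lob"],
    "rating": [np.nan] * 3,
    "year": [np.nan] * 3,
    "Something": [np.nan] * 3,
})

# One boolean per column: True when every value in that column is null
all_null = df.isnull().all()

n_empty = int(all_null.sum())          # columns that are entirely empty
n_non_empty = int((~all_null).sum())   # columns with at least one value

# Names of the columns that contain at least one non-null value
non_empty_cols = df.loc[:, ~all_null].columns.tolist()

print(n_empty, n_non_empty)   # 3 3
print(non_empty_cols)         # ['Country', 'movie', 'name']
```

Because the result is two integers and a plain list rather than a 2000-row Series, the display truncation in IPython should not get in the way.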