Tags: python, dataframe, missing-data

In Python, how do I view the percentage of missing values in each column?


I am a new data scientist, and I am trying to write code that calculates the percentage of missing values in each column of a pandas DataFrame.

Here is a reproducible example:

import pandas as pd

my_df = pd.DataFrame([[None, 2, 3],
                     [4, None, 6],
                     [7, 8, None]])

In this example, one of the three values in each column is missing, so each column is 33.3% missing. My attempt so far is:

my_df.isnull().sum() / my_df.count()

This returns 0.5 for every column instead of 0.33. As I learned while writing it, count() ignores missing values and counts only the non-null ones, so each column's single missing value is divided by 2 rather than 3.

How can I get the correct result, so that the fraction of missing values in each column is 0.33 and not 0.5?
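
To make the arithmetic concrete, here is a minimal sketch of the two halves of that expression. The variable names (missing, non_null, total_rows, ratio_to_non_null) are illustrative only and not part of the original attempt.

```python
import pandas as pd

my_df = pd.DataFrame([[None, 2, 3],
                      [4, None, 6],
                      [7, 8, None]])

# Numerator: number of missing cells in each column.
missing = my_df.isnull().sum()

# Denominator used in the attempt: count() skips missing cells,
# so it reports only the non-null cells in each column.
non_null = my_df.count()

# Total number of rows, whether or not a cell is missing.
total_rows = len(my_df)

# Dividing missing by non-null gives 1 / 2 per column, not 1 / 3.
ratio_to_non_null = missing / non_null
print(ratio_to_non_null)
```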

Thank you!


Solution

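  • Give this a try. Divide by the total number of rows instead of by count(); len(my_df) counts every row, including those with missing values:

    my_df.isnull().sum()/len(my_df)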
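
    As a runnable sketch of the same idea, assuming the DataFrame from the question: the snippet below also shows isnull().mean() as an equivalent shortcut (the mean of a boolean column is the fraction of True values) and scales the result to a percentage. The names fraction_missing, fraction_missing_alt, and percent_missing are illustrative only.

```python
import pandas as pd

my_df = pd.DataFrame([[None, 2, 3],
                      [4, None, 6],
                      [7, 8, None]])

# Fraction of missing values per column: missing cells / total rows.
fraction_missing = my_df.isnull().sum() / len(my_df)

# Equivalent shortcut: the mean of the boolean mask is the share of True cells.
fraction_missing_alt = my_df.isnull().mean()

# Scale to a percentage and round to one decimal place for display.
percent_missing = (fraction_missing * 100).round(1)
print(percent_missing)
```

    Both forms divide by the total number of rows, which is why they give one third per column here rather than the 0.5 produced by dividing by count().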