python pandas missing-data

Count missing values by index group


I just want to count all NA values grouped by the first index level, name_of_collection, and print each collection name with its number of NA values. Could anybody help me? Thank you so much! The expected output:

name_of_collection # of NA
autoglyphs_Data_Clean 48 (for example)
veefriends_Data_Clean 57 (for example)

dataset:


Solution

  • Let "col" be the name of the column in which you are looking for NA values, and df your dataframe. Then this should work:

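    # Flag the rows where "col" is missing (True counts as 1 when summed).
    df["is_na"] = df["col"].isna()
    # Parentheses let the method chain span several lines.
    result = (
        df.groupby("name_of_collection")["is_na"]
        .sum()
        .reset_index()
        .rename(columns={"is_na": "# of NA"})
    )
    print(result)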
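
The question asks for all NA values per collection, not only those in a single column. A minimal sketch of that variant follows, assuming name_of_collection is the first level of a MultiIndex. The toy data, the second index level "row", and the columns "price" and "buyer" are made up for illustration, since the original dataset is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a two-level index whose first level is name_of_collection.
index = pd.MultiIndex.from_tuples(
    [
        ("autoglyphs_Data_Clean", 0),
        ("autoglyphs_Data_Clean", 1),
        ("veefriends_Data_Clean", 0),
        ("veefriends_Data_Clean", 1),
    ],
    names=["name_of_collection", "row"],
)
df = pd.DataFrame(
    {"price": [1.0, np.nan, np.nan, 4.0], "buyer": ["a", None, None, None]},
    index=index,
)

# Count NA cells in each row across all columns,
# then add those counts up within each collection.
na_counts = (
    df.isna()
    .sum(axis=1)
    .groupby(level="name_of_collection")
    .sum()
    .rename("# of NA")
    .reset_index()
)
print(na_counts)
```

With this toy data, autoglyphs_Data_Clean has 2 NA values and veefriends_Data_Clean has 3. Grouping with level= avoids adding a helper column to df.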