python pandas matplotlib missing-data

How to visualize missing values patterns in Pandas


I know there are packages for visualizing missing values, such as missingno. How can I visualize missing-value patterns without additional packages, using only Pandas and Matplotlib? I expect something like the following image, where missing data is white:

[image: example plot in which missing data appears white]


Solution

  • You can get what you need using Matplotlib:

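    import matplotlib.pyplot as plt
    import pandas as pd
    plt.rcParams["figure.figsize"] = (20, 10)
    df = pd.read_excel("C:/Users/Jhonny/Desktop/titanic.xlsx")
    # isnull() returns a True/False mask; with cmap='hot', True (missing) is drawn white
    plt.imshow(df.isnull(), cmap='hot', aspect='auto')
    plt.show()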
    

    Note: I used a subset of the Titanic data from Kaggle.

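    Result:

    Rows are drawn top to bottom starting from index 0, and each DataFrame column is one vertical strip. With cmap='hot', missing cells are white and present cells are black, so the heatmap shows at a glance how many values are missing and where they are.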

    [image: Matplotlib heatmap of df.isnull()]

    I know, it's not so fancy right now. Matplotlib takes more work to turn this raw graphic into something nicer.
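
    As a rough idea of what that extra work could look like, here is a minimal sketch that labels the x axis with the column names and switches to a gray colormap so missing cells stay white. The helper name plot_missing and the small sample DataFrame are made up for illustration; they are not part of the Titanic file used above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_missing(df, ax=None):
    """Draw a missing-value map of df: white = missing, black = present."""
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 5))
    # Cast the boolean mask to int so 1 marks a missing cell and 0 a present one.
    mask = df.isnull().to_numpy().astype(int)
    # interpolation="none" keeps cell edges sharp; vmin/vmax pin the colours
    # even when a frame has no missing values at all.
    ax.imshow(mask, cmap="gray", aspect="auto",
              interpolation="none", vmin=0, vmax=1)
    # Label the x axis with the column names instead of column positions.
    ax.set_xticks(range(df.shape[1]))
    ax.set_xticklabels(df.columns, rotation=90)
    ax.set_ylabel("row position")
    ax.set_title("Missing values (white = missing)")
    return ax


# Hypothetical sample data, only here to make the sketch runnable.
sample = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, np.nan],
    "cabin": [None, "C85", None, None],
    "fare": [7.25, 71.28, 8.05, 8.46],
})
ax = plot_missing(sample)
```

    Call plt.show() afterwards to display the figure, or pass your own DataFrame in place of sample.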

    But if you want something better and faster, I really suggest seaborn.

    Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

    import seaborn as sns
    # Reuses df and plt from the Matplotlib snippet above
    sns.heatmap(df.isnull(), cbar=False)
    plt.show()
    

    [image: seaborn heatmap of df.isnull()]