Tags: python, matplotlib, charts, histogram

How can I put labels on two charts using matplotlib?


I'm trying to plot two histograms from the result of a group by, but the axis labels appear on only one of the charts. How can I put the labels on both charts? And how can I give each chart a different title (e.g. the first as "Men's grades" and the second as "Women's grades")?

import pandas as pd
import matplotlib.pyplot as plt

microdataEnem = pd.read_csv('C:\\Users\\Lucas\\AppData\\Local\\Programs\\Python\\Python39\\Scripts\\Data Science\\Data Analysis\\Projects\\ENEM\\DADOS\\MICRODADOS_ENEM_2019.csv', sep = ';', encoding = 'ISO-8859-1', nrows=10000)

sex_essaygrade = ['TP_SEXO', 'NU_NOTA_REDACAO']

filter_sex_essaygrade = microdataEnem.filter(items = sex_essaygrade)

filter_sex_essaygrade.dropna(subset = ['NU_NOTA_REDACAO'], inplace = True)
       
filter_sex_essaygrade.groupby('TP_SEXO').hist()
plt.xlabel('Grade')
plt.ylabel('Number of students')


plt.show()

https://i.sstatic.net/lWalb.png


Solution

  • plt.xlabel and plt.ylabel only act on the current axes, which is the last chart drawn, so only one of your charts gets labels. Instead of filter_sex_essaygrade.groupby('TP_SEXO').hist(), you can try the following form: axs = filter_sex_essaygrade['NU_NOTA_REDACAO'].hist(by=filter_sex_essaygrade['TP_SEXO']). This automatically titles each histogram with the group name.

    Assign the return value to a variable (axs here) so that you can loop over the axes and set the x and y labels on both plots.

    I created some data similar to yours, and I get the following result:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    
    np.random.seed(42)
    
    sex_essaygrade = ['TP_SEXO', 'NU_NOTA_REDACAO']
    
    # create two distinct sets of grades: 70-99 for the first group, 80-99 for the second
    sample_grades = np.concatenate((np.random.randint(low=70,high=100,size=100), np.random.randint(low=80,high=100,size=100)))
    
    filter_sex_essaygrade = pd.DataFrame({
        'NU_NOTA_REDACAO': sample_grades,
        'TP_SEXO': ['Men']*100 + ['Women']*100
    })
           
    axs = filter_sex_essaygrade['NU_NOTA_REDACAO'].hist(by=filter_sex_essaygrade['TP_SEXO'])
    for ax in axs.flatten():
        ax.set_xlabel("Grade")
        ax.set_ylabel("Number of students")
    
    plt.show()
    

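
The question also asks for a different title on each chart. Since hist(by=...) titles each subplot with its group value, one possible approach is to read that title back and replace it through a lookup table. The following is only a sketch under assumptions: the TITLES dictionary and the plot_grade_histograms helper are hypothetical names, the group values 'M' and 'F' are assumed to be what TP_SEXO holds in your file, and the sample data is made up.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical mapping from group value to chart title. 'M' and 'F' are
# assumed to be the values stored in TP_SEXO; adjust them to match your data.
TITLES = {"M": "Men's grades", "F": "Women's grades"}


def plot_grade_histograms(df, titles):
    """Draw one histogram per TP_SEXO group and label every subplot."""
    axs = df["NU_NOTA_REDACAO"].hist(by=df["TP_SEXO"])
    for ax in np.asarray(axs).flatten():
        # pandas titles each subplot with its group value; use that value to
        # look up the custom title, keeping the existing title if unmapped.
        group = ax.get_title()
        ax.set_title(titles.get(group, group))
        ax.set_xlabel("Grade")
        ax.set_ylabel("Number of students")
    return axs


# Made-up sample data standing in for the ENEM file
rng = np.random.default_rng(42)
sample = pd.DataFrame({
    "NU_NOTA_REDACAO": rng.integers(low=0, high=1001, size=200),
    "TP_SEXO": ["M"] * 100 + ["F"] * 100,
})

axs = plot_grade_histograms(sample, TITLES)
plt.tight_layout()  # keep titles and axis labels from overlapping
```

Call plt.show() afterwards to display the figure. Looking titles up by group value, rather than by position in axs, avoids depending on the order in which pandas lays out the groups.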