python matplotlib statistics seaborn kernel-density

How can I find the mode (a single number) of a KDE plot in Python?


I want to determine the x value at which the KDE curve has its highest peak.

The code that draws the plot:

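import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# df is my DataFrame with a numeric 'degrees' column and a 'TYPE' column
g = sns.displot(data=df, x='degrees', hue="TYPE", kind="kde", height=6, aspect=2)
plt.xticks(np.arange(10, 20, step=0.5))
plt.xlim(10, 20)
plt.grid(axis="x")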

Plot and the value wanted (in fact, I would like all 4):


Solution

  • You will need to retrieve the underlying x and y data of the plotted lines using matplotlib's Line2D methods (get_xdata, get_ydata, or get_data).

    If you are using displot, as in your excerpt, here is a solution on a toy dataset with two groups. It prints the x value of each peak and draws a vertical line at that value. The x value is found by taking the index of the largest y value (np.argmax) and reading the x array at that index.

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from seaborn import displot
    
    np.random.seed(42)  # seed NumPy's global generator so the toy data is reproducible
    
    d1 = pd.DataFrame({'x': np.random.normal(3, 0.2, 100), 'type': 'd1'})
    d2 = pd.DataFrame({'x': np.random.normal(3.3, 0.3, 100), 'type': 'd2'})
    
    df = pd.concat([d1,d2], axis=0, ignore_index=True)
    
    my_kde = displot(data=df, x='x', hue='type', kind='kde')
    
    axes = my_kde.axes.flatten()
    
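    for ax in axes:
        # Collect the peaks first: axvline adds new lines to ax.lines,
        # so drawing inside the same loop would change what is iterated over.
        max_xs = []
        for line in ax.lines:
            max_x = line.get_xdata()[np.argmax(line.get_ydata())]
            print(max_x)
            max_xs.append(max_x)
        for max_x in max_xs:
            ax.axvline(max_x, ls='--', color='black')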
    
    # Example output from one run (exact values depend on the random draw):
    # 3.283798164938401
    # 3.0426118489704757
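
The index lookup used above can be checked in isolation on a curve whose peak is known. This is only a minimal sketch: the grid and the Gaussian bump are made up for illustration and stand in for the x and y arrays returned by a plotted line.

```python
import numpy as np

# Hypothetical sampled curve: a Gaussian bump centred at 3.0 on a regular grid
x = np.linspace(2.0, 4.0, 201)              # grid spacing of 0.01
y = np.exp(-0.5 * ((x - 3.0) / 0.2) ** 2)

peak_index = np.argmax(y)                   # index of the largest y value
mode_x = x[peak_index]                      # x value at that index
grid_step = x[1] - x[0]                     # the result can only be a grid point

print(mode_x, grid_step)
```

Because the peak is read off a grid, the result is only as precise as the grid spacing. If that matters, kdeplot's gridsize parameter can probably be raised to evaluate the curve on a finer grid.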
    


    If you decide to use kdeplot, then the syntax is slightly different:

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from seaborn import kdeplot
    
    np.random.seed(42)  # seed NumPy's global generator so the toy data is reproducible
    
    d1 = pd.DataFrame({'x': np.random.normal(3, 0.2, 100), 'type': 'd1'})
    d2 = pd.DataFrame({'x': np.random.normal(3.3, 0.3, 100), 'type': 'd2'})
    
    df = pd.concat([d1,d2], axis=0, ignore_index=True)
    
    fig, ax = plt.subplots()
    
    my_kde = kdeplot(data=df, x='x', hue='type', ax=ax)
    
    lines = my_kde.get_lines()
    
    for line in lines:
        x, y = line.get_data()
        mode_x = x[np.argmax(y)]  # x value where this density curve peaks
        print(mode_x)
        ax.axvline(mode_x, ls='--', color='black')
    
    # Example output from one run (exact values depend on the random draw):
    # 3.371128998664264
    # 2.944974720030946
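
If only the numbers are needed and not the plot, the same idea can be applied without seaborn by evaluating a Gaussian KDE on a grid directly. The sketch below is an alternative to the approach above, not what seaborn does internally: kde_mode is a hypothetical helper, the bandwidth uses Scott's rule of thumb, and the grid size is an arbitrary choice, so the results may differ slightly from the peaks of the plotted curves.

```python
import numpy as np

def kde_mode(samples, gridsize=512):
    """Return the x location of the highest point of a 1-D Gaussian KDE.

    Hypothetical helper for illustration; gridsize is an arbitrary default.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    # Scott's rule of thumb for the bandwidth of a 1-D Gaussian kernel
    bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    # The highest point of the estimate lies between the smallest and largest sample
    grid = np.linspace(samples.min(), samples.max(), gridsize)
    # Sum one Gaussian bump per sample at every grid point.
    # The normalising constant is skipped because it does not move the peak.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1)
    return grid[np.argmax(density)]

# Toy data shaped like the two groups used in the examples above
rng = np.random.default_rng(42)
groups = {
    'd1': rng.normal(3.0, 0.2, 100),
    'd2': rng.normal(3.3, 0.3, 100),
}

modes = {name: kde_mode(values) for name, values in groups.items()}
print(modes)
```

With a DataFrame like the one in the examples, something along the lines of df.groupby('type')['x'].apply(kde_mode) should give one mode per group. scipy.stats.gaussian_kde offers a ready-made estimator if SciPy is available.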
    
