python pandas matplotlib boxplot

How to plot multiple boxplots in one figure, optionally grouping the values by a categorical variable?


I have a pandas DataFrame that can hold any number of columns, for example this one:

import pandas as pd

df = pd.DataFrame({
    'Löwenbräu-Festzelt': [1.068510948, 1.111444388, 1.097928649, 1.097319892, 1.112046892, 1.096458863, 1.098193952, 1.105528912, 1.081012023, 1.096862587, 1.096820787, 1.112483864, 1.090409846, 1.076176749, 1.05914969, 1.111281072, 1.090280455, 1.071867235, 1.104982445, 1.074247709, 1.103154487, 1.136741808, 1.051554041, 1.089669195, 1.126347645, 1.105658808, 1.117330659, 1.101642591, 1.065208517, 1.082705561, 1.081997508, 1.100248942, 1.102306684, 1.106034801, 1.061078385, 1.065105824, 1.118714312, 1.103743509, 1.10806331, 1.127161842, 1.095313864, 1.083297614, 1.088053678, 1.096490414, 1.103947732, 1.070520785, 1.096987797, 1.045452588, 1.097941923, 1.087059407],
    'Festzelt Tradition': [1.059211299, 1.004684239, 0.998865106, 1.011393955, 1.020030032, 1.000917207, 1.037486604, 1.058419981, 0.999914939, 1.037276828, 1.011935826, 1.00550927, 1.035434798, 1.049929295, 1.023505819, 1.04547058, 1.019198865, 1.01983709, 1.011544282, 1.019780386, 1.001639294, 1.027859424, 1.060448349, 1.047727746, 1.020635143, 1.030990766, 1.003855964, 1.024180945, 1.033970302, 1.024973412, 1.046135278, 1.031333533, 1.037277845, 1.023052959, 1.046540625, 1.014640256, 1.009600155, 0.988617146, 0.993939951, 1.019822804, 0.980809392, 1.034884526, 1.039759923, 1.019183791, 0.980610209, 1.015745219, 0.982644572, 1.019548832, 1.03694442, 0.984112046],
    'Hofbräu Festzelt': [1.034212037, 1.027636547, 1.041131616, 1.015574061, 1.027518052, 1.031624001, 1.055728657, 1.028382302, 1.041745618, 1.032410921, 1.034610184, 1.020176817, 1.02239158, 1.036213718, 1.037768708, 1.025644106, 1.014052738, 1.032167174, 1.033173547, 1.037726475, 1.040055401, 1.035740735, 1.03216279, 1.023292892, 1.025439416, 1.018466613, 1.026367696, 1.03792821, 1.036023857, 1.039614743, 1.049249344, 1.03696263, 1.022792738, 1.005627694, 1.036757788, 1.030524617, 1.037090724, 1.045428404, 1.033092142, 1.030637899, 1.011694064, 1.033305161, 1.026014978, 1.028626086, 1.048727062, 1.029920593, 1.032974373, 1.043707642, 1.020915468, 1.020695672]
})

A second DataFrame provides the categorical values:

ad = pd.DataFrame({
"Category": ["mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "mittags", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends", "abends"]
})

When no categorical variable is provided, I want to generate output like the one in this image: [image]

And when the categorical variable is provided, output like this: [image]


Solution

  • Here is a suggestion that may fulfill your request, or at least give you some ideas:

    For the first plot, you can simply use pandas' DataFrame.boxplot:

    df.boxplot(
        showfliers=False, showcaps=False, patch_artist=True,
        boxprops={"color": "black", "linewidth": 0.5, "facecolor": "#71a9de"},
        medianprops={"color": "black", "linewidth": 0.5}, figsize=(8, 5),
    );
    



    For the second one, you can join the two DataFrames, group by the category, and draw one panel per group:

    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(8, 5), sharex=True, sharey=True)
    
    df.join(ad).groupby("Category").boxplot(
        rot=45, showfliers=False, showcaps=False, patch_artist=True,
        ax=axes, boxprops={"color": "black", "linewidth": 0.5, "facecolor": "#71a9de"},
        medianprops={"color": "black", "linewidth": 0.5, }
    )
    
    plt.suptitle(f'Boxplot für {"; ".join(df.columns)}',
                 x=0.5, y=0.95, ha="center", fontsize="large")
    
    plt.tight_layout()
    
    plt.show()
    

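Since the question asks for the grouping to be optional, the two cases could be wrapped in one helper. The following is only a sketch under some assumptions: the function name `plot_boxplots` and its parameters are hypothetical, the category is passed as a named Series (for example `ad["Category"]`) that shares its index with the values, and one panel is drawn per distinct category rather than a fixed two.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_boxplots(values, categories=None, figsize=(8, 5)):
    """Hypothetical helper: one boxplot per column, one panel per category if given."""
    # Styling copied from the calls above.
    style = dict(
        showfliers=False, showcaps=False, patch_artist=True,
        boxprops={"color": "black", "linewidth": 0.5, "facecolor": "#71a9de"},
        medianprops={"color": "black", "linewidth": 0.5},
    )

    if categories is None:
        # No grouping: a single Axes with one box per column.
        fig, ax = plt.subplots(figsize=figsize)
        values.boxplot(ax=ax, **style)
        return fig

    # Grouping: one panel per distinct category value.
    n_groups = categories.nunique()
    fig, axes = plt.subplots(1, n_groups, figsize=figsize,
                             sharey=True, squeeze=False)
    # join() aligns the category with the values on the index.
    joined = values.join(categories)
    joined.groupby(categories.name).boxplot(rot=45, ax=axes.ravel(), **style)
    fig.tight_layout()
    return fig
```

With the data from the question, `plot_boxplots(df)` would produce the ungrouped figure and `plot_boxplots(df, ad["Category"])` the grouped one. pandas may warn that `sharex`/`sharey` are ignored when axes are passed in; the sharing set in `plt.subplots` still applies.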