Tags: python, matplotlib, seaborn, violin-plot, catplot

Seaborn catplot violin plot shows the age distribution extending below zero


I am trying to plot the age distribution broken down by the survived, sex, and class variables.

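from matplotlib import pyplot
import seaborn

# Load the example Titanic dataset bundled with seaborn
titanic = seaborn.load_dataset("titanic")

# One row of violins per passenger class; each violin is split by sex
g = seaborn.catplot(data=titanic, x='survived', y='age',
                    hue='sex', split=True,
                    row='class', kind='violin', legend=False)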

The result is shown in the picture below.

In the first subplot, the violin I circled extends into negative ages, which doesn't make sense.

How can I solve this problem? The age data does not contain any negative numbers.

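[Image: catplot violin plots of age by survived, split by sex, one row per class; the circled violin in the first subplot extends below zero]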


Solution

  • The violin you circled is based on only 3 values: [2, 25, 50]. A violin plot draws a kernel density estimate (KDE) of the data rather than the data itself. With so few, widely spread points, a significant portion of the estimated density lies below zero, even though no observed age is negative.

    If you want, you can limit each violin to the range of the observed data by adding the parameter cut=0 (see the seaborn.violinplot documentation).
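To see why three non-negative ages can produce a curve below zero, here is a rough standard-library sketch of the estimate for [2, 25, 50]. It assumes a Gaussian kernel with Scott's rule bandwidth, which I believe matches seaborn's default but have not checked against every version. The helper names (normal_cdf, plotted_range) are made up for this illustration and are not seaborn functions.

```python
import math
import statistics

# The three ages behind the circled violin, as given in the answer.
ages = [2, 25, 50]

# Assumption: Gaussian kernel with Scott's rule bandwidth,
# h = sample standard deviation * n ** (-1/5).
n = len(ages)
bandwidth = statistics.stdev(ages) * n ** (-1 / 5)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Share of the estimated density below age 0: the average, over the
# data points, of the mass each point's kernel places below 0.
mass_below_zero = sum(normal_cdf((0 - a) / bandwidth) for a in ages) / n


def plotted_range(values, h, cut):
    """Range the curve is drawn over: `cut` bandwidths past the data."""
    return min(values) - cut * h, max(values) + cut * h


print(f"bandwidth: {bandwidth:.1f}")
print(f"share of density below 0: {mass_below_zero:.1%}")
print("range with cut=2:", plotted_range(ages, bandwidth, 2))
print("range with cut=0:", plotted_range(ages, bandwidth, 0))
```

Under these assumptions the bandwidth is about 19 years, so roughly a fifth of the estimated density sits below zero, and extending the curve two bandwidths past the smallest age (seaborn's documented default, cut=2) reaches well into negative ages. With cut=0 the range stops at the observed minimum and maximum. In the question's code this means adding cut=0 to the seaborn.catplot call.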