Tags: r, plot, polygon, kernel-density, density-plot

How can I draw a density plot for multiple groups with the polygon function?


I am using the built-in "iris" data set in R.

So far I have taken Sepal.Length as the x values and the output of the density function as the y values, and then tried to plot them. However, the plot does not show three separate groups (one per species) as I intended.

My guess is that something is wrong with my grouping. Could you please help me?


x<-iris$Sepal.Length
y<-iris$Species
z<-split(x,y)
x_min <- min(x)
x_max <- max(x)
a <- sapply(z, function(x) density(x,from=x_min,to=x_max))
a
for(i in 1:length(a)){
    polygon(a$x[i],a$y[i])
}

[Image: current output]

And here is what the output should look like:

[Image: expected output]

Thank you so much.


Solution

  • ggplot2 makes grouped operations like this much easier, because you only have to map the grouping variable to the aesthetic that differentiates the groups (here fill), and it takes care of colors and legends for you. (You can customize further, of course.)

    library(ggplot2)
    
    ggplot(iris, aes(Sepal.Length, fill = Species)) + geom_density()
    

    If you want to do it in base plotting, it's usually easiest to set the empty window first, then iterate over the groups. Since you'll want to set colors as well as groups, Map is more apt than lapply. Legends require an extra call.

    # Set up an empty plotting window large enough for all three densities
    plot(x = c(4, 8.5), y = c(0, 1.5), type = "n",
         xlab = "Sepal Length", ylab = "Density",
         main = "Density plot of multiple groups")
    
    cols <- c("pink", "lightgreen", "lightblue")
    
    # One kernel density estimate per species
    d <- lapply(split(iris$Sepal.Length, iris$Species), density)
    
    # Draw one filled polygon per group; invisible() hides Map's list of NULLs
    invisible(Map(function(dens, col) polygon(dens, col = col),
                  dens = d, col = cols))
    
    legend("topright", levels(iris$Species), fill = cols)
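
As for why the code in the question draws nothing useful: as far as I can tell, sapply simplifies the three density objects into a list-matrix, so a$x comes back NULL, and polygon only adds to a plot that is already open. Below is a minimal sketch that stays close to the question's loop, assuming lapply in place of sapply and an empty plot created first. The names groups, x_rng, dens, y_max and cols, and the use of adjustcolor for transparency, are my own choices for illustration, not part of the original code.

```r
# Split sepal lengths by species and estimate one density per group.
# lapply() keeps each "density" object intact, so $x and $y stay accessible.
groups <- split(iris$Sepal.Length, iris$Species)
x_rng  <- range(iris$Sepal.Length)
dens   <- lapply(groups, density, from = x_rng[1], to = x_rng[2])

# Size the y-axis from the estimates themselves
y_max <- max(sapply(dens, function(d) max(d$y)))

# Illustrative colours, made semi-transparent so overlaps stay visible
cols <- adjustcolor(c("pink", "lightgreen", "lightblue"), alpha.f = 0.6)

# polygon() only adds to an existing plot, so open an empty one first
plot(x_rng, c(0, y_max), type = "n",
     xlab = "Sepal Length", ylab = "Density")

for (i in seq_along(dens)) {
  # Close each shape along the baseline, because the curve is cut off at
  # from/to and does not return to zero on its own
  polygon(c(x_rng[1], dens[[i]]$x, x_rng[2]), c(0, dens[[i]]$y, 0),
          col = cols[i])
}

legend("topright", legend = names(dens), fill = cols)
```

Taking the axis limits from the data avoids hard-coding them, at the cost of a slightly longer script. If you do not need a common from/to range, passing the density object straight to polygon, as in the answer above, is shorter.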