r ggplot2 boxplot outliers

Remove outliers from stat_summary in ggplot2


I have this code to produce a boxplot from my data:

library(ggplot2)

p <- ggplot(meltData, aes(x = variable, y = value)) +
  geom_boxplot(outlier.colour = "red", outlier.shape = 1, outlier.size = 2) +
  # Label the five quantiles (min, Q1, median, Q3, max) of each group
  stat_summary(geom = "text", fun = quantile,
               aes(label = sprintf("%1.1f", after_stat(y)), color = factor(variable)),
               position = position_nudge(x = 0.0), size = 3.5, show.legend = FALSE) +
  ggtitle("Species measurements") +
  ggeasy::easy_center_title()
p

and I get this output: [image]

I want the labels on each boxplot to show the upper and lower whisker values as the maximum and minimum, not the outlier values. For example, the 5th boxplot shows a maximum of 72, but that value is an outlier; the maximum should be approximately 56.


Solution

  • If I understand your goal correctly, you want boxplots with text labels showing the upper and lower whisker values, and no outliers shown in the plots. If so, I agree with @Death Metal that you might want to filter out the outliers per category.

    However, because you did not provide reproducible data, here is dummy data similar to yours.

    library(dplyr)
    library(tidyr)
    library(ggplot2)

    # iris measurements plus three artificial outliers (20, 21, 22) per column
    dat <- data.frame(var.A = c(iris$Sepal.Length, c(20,21,22)), 
                      var.B = c(iris$Petal.Length, c(20,21,22)))
    meltData <- dat %>% pivot_longer(cols = c(var.A, var.B), 
                                     values_to = "value", 
                                     names_to = "variable")
    
    ggplot(meltData, aes(x=variable, y=value)) + geom_boxplot()
    

    which clearly shows the outliers:

    enter image description here
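
    To check which values count as outliers before filtering, base R's boxplot.stats() can be applied to each variable. This is a minimal sketch on the dummy data above; the list name `values` is only for illustration. Note that boxplot.stats() measures the box from the hinges, while geom_boxplot() uses quantiles, so the two can disagree slightly on borderline points.

    ```r
    # Dummy data from above, kept as two plain vectors for this check.
    values <- list(
      var.A = c(iris$Sepal.Length, 20, 21, 22),
      var.B = c(iris$Petal.Length, 20, 21, 22)
    )

    # boxplot.stats() returns the points beyond the whiskers in $out
    # (default coef = 1.5, i.e. 1.5 times the spread between the hinges).
    outliers <- lapply(values, function(v) boxplot.stats(v)$out)
    outliers
    ```

    For this data, only the three artificial values should be flagged in each variable.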

    Here is one way to filter out the outliers before drawing the boxplots:

    meltData %>% group_by(variable) %>%
         # Keep only the values that boxplot.stats() does not flag as outliers in each group
         filter(!value %in% boxplot.stats(value)$out) %>% 
         ggplot(aes(x = variable, y = value)) + 
         geom_boxplot() + stat_summary(geom="text", 
                                       fun=quantile,aes(label=sprintf("%1.1f", after_stat(y)), 
                                                        color=factor(variable)),
                                       position=position_nudge(x=0.0), 
                                       size=3.5,show.legend = FALSE)+
         ggtitle("Species measurements")+
         ggeasy::easy_center_title()
    

    The result:

    enter image description here
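
    Filtering changes the data that the boxes are computed from. If the outliers should stay in the plot and only the labels should stop at the whiskers, one possible alternative is to pass stat_summary() a function that returns the whisker ends instead of the raw minimum and maximum. The sketch below assumes the default 1.5 × IQR whisker rule; `whisker_stats` is a hypothetical helper written for this example, not part of ggplot2.

    ```r
    library(ggplot2)

    # Dummy data as above, built in base R so the example is self-contained.
    n <- nrow(iris) + 3
    meltData <- data.frame(
      variable = rep(c("var.A", "var.B"), each = n),
      value = c(iris$Sepal.Length, 20, 21, 22,
                iris$Petal.Length, 20, 21, 22)
    )

    # Hypothetical helper: lower whisker, Q1, median, Q3, upper whisker.
    # The whiskers are the most extreme values within coef * IQR of the box.
    whisker_stats <- function(y, coef = 1.5) {
      q <- unname(quantile(y, c(0.25, 0.5, 0.75)))
      iqr <- q[3] - q[1]
      inside <- y >= q[1] - coef * iqr & y <= q[3] + coef * iqr
      c(min(y[inside]), q, max(y[inside]))
    }

    p <- ggplot(meltData, aes(x = variable, y = value)) +
      geom_boxplot(outlier.colour = "red", outlier.shape = 1, outlier.size = 2) +
      # Label the five boxplot statistics; outliers are drawn but not labelled
      stat_summary(geom = "text", fun = whisker_stats,
                   aes(label = sprintf("%1.1f", after_stat(y)), color = variable),
                   size = 3.5, show.legend = FALSE)
    p
    ```

    With this approach the red outlier points remain visible, while the top and bottom labels sit at the whisker ends.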