Tags: r, ggplot2, axis, threshold, continuous

Custom categorical threshold axis break on a continuous scale in R


Consider the following plot:

library(ggplot2)
ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = factor(cyl))) +
  scale_y_continuous(name = "Weight", breaks = c(2, 3, 4, 5))

Does anyone know of a way to replace the value of e.g. 5 with a categorical break such as "Above 5", with the three observations appearing on this created break line? I am looking for a way to include outliers in a plot without skewing it yet still be able to show information pertaining to them (in this case, their mpg values) instead of excluding them completely.

The following code:

library(ggplot2)
ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = factor(cyl))) +
  scale_y_continuous(name = "Weight", breaks = c(2, 3, 4, >5), labels = c(2, 3, 4, "Above 5"))

does not work because of the ">" symbol in breaks, which must be a plain numeric vector. Any suggestions? Thanks.


Solution

  • I found that a simple data manipulation procedure prior to plotting gives what I want.

    library(dplyr)
    # Cap weights above 5 at 5; every other value (including exactly 5) is kept as is
    mtcars <- mtcars %>% mutate(wt2 = case_when(wt > 5 ~ 5,
                                                TRUE   ~ wt))
    

    The above code assigns the value 5 to any wt value above 5, so that these observations appear on the same break line. The data can then be plotted, and a lower alpha value makes overlapping points visible.

    library(ggplot2)
    # alpha and size are constants, so they are set outside aes() to avoid spurious legends
    ggplot(mtcars, aes(mpg, wt2)) +
      geom_point(aes(colour = factor(cyl)), alpha = 0.2, size = 2) +
      scale_y_continuous(name = "Weight", breaks = c(2, 3, 4, 5), labels = c("2", "3", "4", "Above 5"))
    

    Thank you for your comments.
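
A possible alternative, sketched here rather than taken from the answer above, is to leave the data untouched and let the scale do the capping. The oob argument of scale_y_continuous() accepts scales::squish, which moves out-of-range values onto the nearest limit. The limits c(1, 5) and alpha = 0.5 are assumptions chosen for mtcars.

```r
library(ggplot2)

# Squish out-of-bounds weights onto the nearest limit when the scale is applied,
# leaving the wt column in mtcars unchanged.
# limits = c(1, 5) is an assumption that fits this data set.
p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = factor(cyl)), alpha = 0.5, size = 2) +
  scale_y_continuous(
    name   = "Weight",
    limits = c(1, 5),
    breaks = c(2, 3, 4, 5),
    labels = c("2", "3", "4", "Above 5"),
    oob    = scales::squish
  )

p
```

Note that squish works in both directions, so with these limits any weight below 1 would be drawn at 1. With the default oob behaviour, points outside the limits are dropped with a warning instead.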