Considering the following plot:
library(ggplot2)
ggplot(mtcars, aes(mpg, wt)) +
geom_point(aes(colour = factor(cyl))) +
scale_y_continuous(name = "Weight", breaks = c(2, 3, 4, 5))
Does anyone know of a way to replace a numeric break, e.g. 5, with a categorical break such as "Above 5", so that the three observations with a weight above 5 appear on that break line? I am looking for a way to include outliers in a plot without skewing it, while still showing information about them (in this case, their mpg values) instead of excluding them completely.
The following code:
library(ggplot2)
ggplot(mtcars, aes(mpg, wt)) +
geom_point(aes(colour = factor(cyl))) +
scale_y_continuous(name = "Weight", breaks = c(2, 3, 4, >5), labels = c(2, 3, 4, "Above 5"))
does not work because ">5" is not valid R syntax inside the breaks vector. Any suggestions? Thanks.
I found that a simple data manipulation procedure prior to plotting gives what I want.
library(dplyr)
# Cap weights at 5; use <= so a weight of exactly 5 is kept rather than turned into NA
mtcars <- mtcars %>%
  mutate(wt2 = case_when(wt <= 5 ~ wt,
                         wt > 5 ~ 5))
The above code assigns the value 5 to any wt value above 5, so that these points appear on the same break line. I can then plot the data, using a lower alpha value to reveal overlapping points.
library(ggplot2)
ggplot(mtcars, aes(mpg, wt2)) +
# alpha and size are constants, so they are set outside aes() to avoid spurious legends
geom_point(aes(colour = factor(cyl)), alpha = 0.2, size = 2) +
scale_y_continuous(name = "Weight", breaks = c(2, 3, 4, 5), labels = c(2, 3, 4, "Above 5"))
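A possible alternative that leaves the data untouched is to let the scale do the capping. This is only a sketch: it assumes the scales package is installed, and the lower limit of 1 and the alpha of 0.5 are arbitrary choices of mine. The oob argument of scale_y_continuous() accepts scales::squish, which moves out-of-bounds values onto the nearest limit, so weights above 5 end up on the "Above 5" break line.

```r
library(ggplot2)
library(scales)

# Sketch: squish out-of-range weights onto the upper limit instead of
# editing the data. The limits c(1, 5) are an assumption for this example.
p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = factor(cyl)), alpha = 0.5, size = 2) +
  scale_y_continuous(
    name   = "Weight",
    limits = c(1, 5),
    breaks = c(2, 3, 4, 5),
    labels = c("2", "3", "4", "Above 5"),
    oob    = squish  # values outside the limits are moved to the nearest limit
  )

print(p)
```

Note that squish also pulls values below the lower limit up to it, so the lower limit should sit under the smallest weight in the data if only the upper end is meant to be capped.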
Thank you for your comments.