I'm trying to learn how to highlight and annotate specific points in a graph. For a reproducible example, I'm using the UBSprices dataset from the alr4 package.
I'm drawing an OLS line and a y = x line. I want to highlight and annotate the points that are farthest from the OLS line (that is, the ones with the largest absolute residuals).
Here's my code so far:
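library(ggplot2)
library(alr4)  # provides the UBSprices dataset

ggplot(UBSprices, aes(x = bigmac2003, y = bigmac2009)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +  # OLS line
  geom_abline(color = "green", size = 1) +  # y = x line (slope 1, intercept 0)
  coord_fixed()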
You could calculate the residuals and then identify those with an absolute value greater than some cutoff quantile. For example:
library(tidyverse)
library(alr4)

UBSprices %>%
  # Residuals from the OLS fit; flag the top 10% by absolute value
  mutate(resid = resid(lm(bigmac2009 ~ bigmac2003, data = .)),
         mark = abs(resid) >= quantile(abs(resid), probs = 0.9)) %>%
  ggplot(aes(x = bigmac2003, y = bigmac2009)) +
  geom_point(aes(colour = mark), show.legend = FALSE) +
  geom_smooth(method = "lm", se = FALSE) +
  geom_abline(color = "green", size = 1) +
  coord_fixed() +
  theme_bw() +
  # FALSE (unmarked) points are blue, TRUE (marked) points are red
  scale_colour_manual(values = c("blue", "red"))
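The code above colours the marked points but does not label them. To annotate them as well, one option is to add a geom_text() layer restricted to the marked rows. The following is a minimal sketch, not a definitive solution: it assumes the row names of UBSprices hold the city names, and the column name city and the objects fit, marked, and p are names introduced here purely for illustration.

```r
library(tidyverse)
library(alr4)

# Fit the OLS model once so the residuals can be reused
fit <- lm(bigmac2009 ~ bigmac2003, data = UBSprices)

marked <- UBSprices %>%
  rownames_to_column("city") %>%  # assumption: row names are city names
  mutate(resid = unname(resid(fit)),
         mark = abs(resid) >= quantile(abs(resid), probs = 0.9))

p <- ggplot(marked, aes(x = bigmac2003, y = bigmac2009)) +
  geom_point(aes(colour = mark), show.legend = FALSE) +
  geom_smooth(method = "lm", se = FALSE) +
  # newer ggplot2 versions prefer linewidth over size for lines
  geom_abline(color = "green", size = 1) +
  # label only the flagged points, nudged slightly above each point
  geom_text(data = dplyr::filter(marked, mark), aes(label = city),
            vjust = -0.8, size = 3) +
  coord_fixed() +
  theme_bw() +
  scale_colour_manual(values = c("blue", "red"))

p
```

If the labels overlap, geom_text_repel() from the ggrepel package is a common substitute for geom_text(). Changing probs (for example to 0.95) marks fewer points.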