I have used ggplot in a loop to generate scatter plots for each of my 200 variables (V1, V2, etc.). To make the scatter plots clearer, I would like to label the outliers automatically: for each variable, I want to label the points that are greater than that variable's 95th percentile.
I tried using the code from "Label points in geom_point"; however, that is more of a manual approach to labeling outliers, because the threshold is hard-coded. I have about 200 variables and cannot specify a value for each of them.
The closest solution I could find was the one from the question mentioned above, where county_list[i] is the variable I'm currently looping over:
ggplot(nba, aes(x= county_list[i], y= Afd_2017, colour="green", label=Name))+
geom_point() +
geom_text(aes(label=ifelse(value_of_V[i]>24,as.character(Name),'')),hjust=0,vjust=0)
What I would like is something like this:
ggplot(nba, aes(x= county_list[i], y= Afd_2017, colour="green", label=Name))+
geom_point() +
geom_text(aes(label=ifelse((value_of_V[i] >greater-than-value-of-the-95-Percentile-of-the-value_of_V[i]),as.character(Name),'')),hjust=0,vjust=0)
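You could create a list of plots using lapply (or purrr's map), computing each column's own 95th percentile with quantile() so that no threshold has to be hard-coded: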
library(ggplot2)
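# One plot per column of nba, skipping the first column
list_plots <- lapply(nba[-1], function(data)
  ggplot(nba, aes(x = MIN, y = data, colour = "green", label = Name)) +
    geom_point() +
    # Label only the points above this column's own 95th percentile
    geom_text(aes(label = ifelse(data > quantile(data, 0.95),
                                 as.character(Name), '')),
              hjust = 0, vjust = 0))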
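If you want to check the labelling rule on its own, outside of ggplot, here is a minimal sketch in base R. The helper label_outliers and the toy data frame are hypothetical names made up for illustration; they are not part of ggplot2 or of your data. Using na.rm = TRUE is an assumption, in case some of your variables contain missing values.

```r
# Hypothetical helper (not part of ggplot2): returns the label for values
# above the chosen percentile of their own vector, and "" otherwise.
label_outliers <- function(values, labels, prob = 0.95) {
  cutoff <- quantile(values, probs = prob, na.rm = TRUE)
  # Comparisons with NA give NA, so treat missing values as "not an outlier"
  is_outlier <- !is.na(values) & values > cutoff
  ifelse(is_outlier, as.character(labels), "")
}

# Toy data standing in for nba (column names are assumptions)
toy <- data.frame(Name = paste0("P", 1:20), V1 = 1:20, V2 = c(50, 1:19))

labels_v1 <- label_outliers(toy$V1, toy$Name)  # only the row with V1 = 20 is labelled
labels_v2 <- label_outliers(toy$V2, toy$Name)  # only the row with V2 = 50 is labelled
```

Inside the lapply call above, the same helper could be used as aes(label = label_outliers(data, Name)).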
You can then access individual plots in list_plots by subsetting the list with [[:
list_plots[[6]]
list_plots[[7]]
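If you would rather loop over the variable names, as with county_list in the question, a variant along these lines might work. It is only a sketch under assumptions: the toy data frame and the vars vector are made up for illustration, and it relies on the .data pronoun that ggplot2 supports inside aes() to look up a column by its name. Looping over names also lets each plot carry its variable name on the y axis and as the list name.

```r
library(ggplot2)

# Toy stand-in for the real data; column names here are assumptions
toy <- data.frame(
  Name = paste0("P", 1:20),
  MIN  = seq(10, 48, by = 2),
  V1   = 1:20,
  V2   = c(50, 1:19)
)

vars <- c("V1", "V2")  # the variable names to loop over

plots <- lapply(vars, function(v) {
  # 95th percentile of this variable only
  cutoff <- quantile(toy[[v]], 0.95, na.rm = TRUE)
  ggplot(toy, aes(x = MIN, y = .data[[v]])) +
    geom_point() +
    # Label only the points above this variable's cutoff
    geom_text(aes(label = ifelse(.data[[v]] > cutoff, as.character(Name), "")),
              hjust = 0, vjust = 0) +
    labs(y = v)
})
names(plots) <- vars

plots[["V2"]]  # draws the plot for V2
```

Because each call to the function gets its own v and cutoff, every plot keeps the threshold of its own variable.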