In a boxplot I've set the option outline=FALSE to remove the outliers.
Now I'd like to include points that show the mean in the boxplot. Obviously, the means calculated using mean() include the outliers.
How can the very same outliers be removed from a dataframe so that the calculated mean corresponds to the data shown in the boxplot?
I know how outliers can be removed, but which settings does the outline option of boxplot use internally? Unfortunately, the manual does not clarify this.
Setting the option outline to FALSE only keeps the outliers from being drawn; it does not remove them from the data.
Let's assume that your data are the following:
data <- data.frame(a = c(seq(0,1,0.1),3))
Then, you use the boxplot function:
res <- boxplot(data, outline=FALSE)
The res object contains several pieces of information about your data. Among these, res$out gives you all the outliers, which are recorded even when they are not drawn. Here there is only the value 3.
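As for which rule is applied internally: boxplot() passes its range argument (default 1.5) to boxplot.stats() as coef, and any point more than 1.5 times the box length beyond a hinge counts as an outlier. Below is a minimal sketch that reproduces this without drawing anything; the names st, hinge_spread and fences are only illustrative.

```r
# Same toy data as above.
data <- data.frame(a = c(seq(0, 1, 0.1), 3))

# boxplot() relies on boxplot.stats(); coef = 1.5 mirrors the default range = 1.5.
st <- boxplot.stats(data$a, coef = 1.5)

st$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
st$out    # points lying beyond the whiskers

# Fences implied by the rule: hinges -/+ 1.5 * (upper hinge - lower hinge).
hinge_spread <- st$stats[4] - st$stats[2]
fences <- c(st$stats[2] - 1.5 * hinge_spread,
            st$stats[4] + 1.5 * hinge_spread)
fences
```

Passing a different range to boxplot() changes which points are flagged, so the same value should be given as coef if you call boxplot.stats() yourself.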
Thus, to compute the mean without the outliers, you can simply do:
mean(data$a[!data$a %in% res$out])
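With more than one column, res$out pools the outliers of all groups (res$group records which group each one came from), so a value flagged in one column could be dropped from another by mistake. One possible way around this, sketched here with a hypothetical two-column data frame df, is to filter each column against its own outliers and then draw the means on top of the boxes:

```r
# Hypothetical two-column data frame, used only for illustration.
df <- data.frame(a = c(seq(0, 1, 0.1), 3),
                 b = c(seq(2, 3, 0.1), 10))

# Mean of each column after dropping that column's own outliers.
trimmed_means <- sapply(df, function(x) {
  mean(x[!x %in% boxplot.stats(x)$out])
})

boxplot(df, outline = FALSE)
# Boxes are drawn at x = 1, 2, ..., so the means line up with them.
points(seq_along(trimmed_means), trimmed_means, pch = 19, col = "red")
```

Because outline=FALSE also sets the y-axis limits from the whiskers only, the trimmed means always fall inside the plotted range.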