Identifying the outliers in a data set in R


I have a data set and know how to get the five-number summary using the summary command. Now I need the instances that lie above Q3 + 1.5 × IQR or below Q1 - 1.5 × IQR. Since these bounds are just numbers, how do I return the instances from the data set that lie above the upper bound or below the lower one?


Solution

  • You can get this using boxplot. If your variable is x,

    OutVals = boxplot(x)$out    # values flagged as outliers (this also draws the plot)
    which(x %in% OutVals)       # positions of those values in x
    

    If you do not want the plot to be drawn, you could use

    OutVals = boxplot(x, plot=FALSE)$out
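
If you would rather apply the Q1 - 1.5 × IQR and Q3 + 1.5 × IQR rule from the question directly, a minimal sketch is shown here. The vector `x` and its values are made up for illustration, and `lower`, `upper`, `out_idx` and `out_vals` are just names chosen for this example. Note that `boxplot` computes its quartiles as hinges (via `fivenum`), so on some data its fences can differ slightly from the ones `quantile` gives.

```r
# Hypothetical example data; 40 and -15 are planted as extreme values
x <- c(2, 3, 4, 5, 5, 6, 7, 8, 40, -15)

# First and third quartiles (default quantile type, the one summary() reports)
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Fences from the 1.5 * IQR rule
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Positions and values of the observations outside the fences
out_idx  <- which(x < lower | x > upper)
out_vals <- x[out_idx]
```

If the values are a column of a data frame, say a hypothetical `df$x`, the same index can be used to pull out whole rows with `df[out_idx, ]`.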