Label outliers using mvOutlier from MVN in R


I'm trying to label outliers on a Chi-square Q-Q plot using the mvOutlier() function from the MVN package in R.

I have managed to identify the outliers by their labels and get their x-coordinates. I tried placing the labels on the plot using text(), but the x- and y-coordinates seem to be flipped.

Building on an example from the documentation:

library(MVN)
data(iris)
versicolor <- iris[51:100, 1:3]
# Mahalanobis distance
result <- mvOutlier(versicolor, qqplot = TRUE, method = "quan")
labelsO <- rownames(result$outlier)[result$outlier[, 2] == TRUE]
xcoord <- result$outlier[result$outlier[, 2] == TRUE, 1]
text(xcoord, label = labelsO)

This produces the following: [image: resulting plot]

I also tried text(x = xcoord, y = xcoord,label = labelsO), which is fine when the points are near the y = x line, but might fail when normality is not satisfied (and the points deviate from this line).

Can someone suggest how to access the Chi-square quantiles, or explain why the x-coordinate passed to text() doesn't seem to be respected?


Solution

  • Looking inside the mvOutlier function, it doesn't appear to save the chi-squared quantiles. Right now your text() call receives only one coordinate vector, so it treats xcoord as the y-values and uses the index (here 1:2) as the x-values. Thankfully the chi-squared quantile is simple to recompute, as it is rank-based in this case.

    result <- mvOutlier(versicolor, qqplot = TRUE, method = "quan")
    labelsO <- rownames(result$outlier)[result$outlier[, 2] == TRUE]
    xcoord <- result$outlier[result$outlier[, 2] == TRUE, 1]
    # Recalculate the chi-squared quantiles for ranks 50 and 49, i.e.
    # p = ((size:(size - n.outliers + 1)) - 0.5) / size, with df = n.variables = 3
    chis <- qchisq(((50:49) - 0.5) / 50, df = 3)
    text(xcoord, chis, labels = labelsO)
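
The ranks 50 and 49 above are hard-coded for this data set. Below is a minimal sketch of a more general version, assuming the plot pairs each distance with qchisq((rank - 0.5) / n, df) as in the calculation above. The helper name chisq_qq_quantiles is hypothetical and not part of MVN. To keep the example runnable with base R only, classical distances from mahalanobis() and a 97.5% chi-square cutoff stand in for the robust distances and cutoff that mvOutlier computes, so the flagged points may differ from those in the question.

```r
# Hypothetical helper (not part of MVN): the chi-square quantile paired with
# each distance on a rank-based Q-Q plot.
chisq_qq_quantiles <- function(d, df) {
  n <- length(d)
  r <- rank(d)                      # rank 1 = smallest distance
  qchisq((r - 0.5) / n, df = df)
}

versicolor <- iris[51:100, 1:3]
p <- ncol(versicolor)

# Assumption: classical squared Mahalanobis distances stand in for the robust
# distances that mvOutlier() reports.
d <- mahalanobis(versicolor, colMeans(versicolor), cov(versicolor))
chi2q <- chisq_qq_quantiles(d, df = p)

# Assumption: an illustrative cutoff at the 97.5% chi-square quantile.
is_out <- d > qchisq(0.975, df = p)

plot(d, chi2q, pch = 16,
     xlab = "Squared Mahalanobis Distance", ylab = "Chi-Square Quantile")

# text() fails on zero-length labels, so only label when something is flagged.
if (any(is_out)) {
  text(d[is_out], chi2q[is_out],
       labels = rownames(versicolor)[is_out], pos = 2)
}
```

Because the quantiles are computed from rank(d) rather than from fixed positions, the same code should keep working if the number of flagged points changes or if the distances are not already sorted.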