I am using the dataEllipse function from the car package in R to get an elliptic confidence region for my data. For example:
datapoints_x = c(1,3,5,7,8,6,5,4,9)
datapoints_y = c(3,6,8,9,5,8,7,4,8)
ellipse = dataEllipse(cbind(datapoints_x, datapoints_y), levels=0.95)
The output are two vectors x and y corresponding to points that define the ellipse:
head(ellipse)
# x y
# [1,] 12.79906 10.27685
# [2,] 12.74248 10.84304
# [3,] 12.57358 11.34255
# [4,] 12.29492 11.76781
# [5,] 11.91073 12.11238
# [6,] 11.42684 12.37102
But rather than that I am interested in the length of the ellipsis axes and its center. Is there a way to get this without carrying out the PCA by myself?
From ?dataEllipse
you read that these functions are mostly plotting functions, not functions designed to give you the fitted ellipse. However reading the source code of dataEllipse
, it becomes clear that the function used to fit the ellipse is cov.wt
from the stats
package. This function should be able to give you the center and covariance matrix used to specify the ellipse location and shape:
set.seed(144)
x <- rnorm(1000)
y <- 3*x + rnorm(1000)
(ell.info <- cov.wt(cbind(x, y)))
# $cov
# x y
# x 1.022985 3.142274
# y 3.142274 10.705215
#
# $center
# x y
# -0.09479274 -0.23889445
#
# $n.obs
# [1] 1000
The center of the ellipse is now readily available from ell.info$center
. The directions of the axes are accessible as the eigenvectors of the covariance matrix (columns of eigen.info$vectors
below).
(eigen.info <- eigen(ell.info$cov))
# $values
# [1] 11.63560593 0.09259443
#
# $vectors
# [,1] [,2]
# [1,] 0.2839051 -0.9588524
# [2,] 0.9588524 0.2839051
Finally you need to know the length of the axes (I'll give the length from the center to the ellipse, aka the radius on that axis):
(lengths <- sqrt(eigen.info$values * 2 * qf(.95, 2, length(x)-1)))
# [1] 8.3620448 0.7459512
Now we can get the four endpoints of the axes of the ellipse:
ell.info$center + lengths[1] * eigen.info$vectors[,1]
# x y
# 2.279234 7.779072
ell.info$center - lengths[1] * eigen.info$vectors[,1]
# x y
# -2.468820 -8.256861
ell.info$center + lengths[2] * eigen.info$vectors[,2]
# x y
# -0.81004983 -0.02711513
ell.info$center - lengths[2] * eigen.info$vectors[,2]
# x y
# 0.6204643 -0.4506738
We can confirm these are accurate from using dataEllipse
:
library(car)
dataEllipse(x, y, levels=0.95)