Search code examples
rr-car

Getting the parameters of a data ellipse produced by the car package in R


I am using the dataEllipse function from the car package in R to get an elliptic confidence region for my data. For example:

datapoints_x = c(1,3,5,7,8,6,5,4,9)
datapoints_y = c(3,6,8,9,5,8,7,4,8)
ellipse = dataEllipse(cbind(datapoints_x, datapoints_y), levels=0.95)

The output are two vectors x and y corresponding to points that define the ellipse:

head(ellipse)
#             x        y
# [1,] 12.79906 10.27685
# [2,] 12.74248 10.84304
# [3,] 12.57358 11.34255
# [4,] 12.29492 11.76781
# [5,] 11.91073 12.11238
# [6,] 11.42684 12.37102

But rather than that I am interested in the length of the ellipsis axes and its center. Is there a way to get this without carrying out the PCA by myself?


Solution

  • From ?dataEllipse you read that these functions are mostly plotting functions, not functions designed to give you the fitted ellipse. However reading the source code of dataEllipse, it becomes clear that the function used to fit the ellipse is cov.wt from the stats package. This function should be able to give you the center and covariance matrix used to specify the ellipse location and shape:

    set.seed(144)
    x <- rnorm(1000)
    y <- 3*x + rnorm(1000)
    (ell.info <- cov.wt(cbind(x, y)))
    # $cov
    #          x         y
    # x 1.022985  3.142274
    # y 3.142274 10.705215
    # 
    # $center
    #           x           y 
    # -0.09479274 -0.23889445 
    # 
    # $n.obs
    # [1] 1000
    

    The center of the ellipse is now readily available from ell.info$center. The directions of the axes are accessible as the eigenvectors of the covariance matrix (columns of eigen.info$vectors below).

    (eigen.info <- eigen(ell.info$cov))
    # $values
    # [1] 11.63560593  0.09259443
    # 
    # $vectors
    #           [,1]       [,2]
    # [1,] 0.2839051 -0.9588524
    # [2,] 0.9588524  0.2839051
    

    Finally you need to know the length of the axes (I'll give the length from the center to the ellipse, aka the radius on that axis):

    (lengths <- sqrt(eigen.info$values * 2 * qf(.95, 2, length(x)-1)))
    # [1] 8.3620448 0.7459512
    

    Now we can get the four endpoints of the axes of the ellipse:

    ell.info$center + lengths[1] * eigen.info$vectors[,1]
    #        x        y 
    # 2.279234 7.779072 
    ell.info$center - lengths[1] * eigen.info$vectors[,1]
    #         x         y 
    # -2.468820 -8.256861 
    ell.info$center + lengths[2] * eigen.info$vectors[,2]
    #           x           y 
    # -0.81004983 -0.02711513 
    ell.info$center - lengths[2] * eigen.info$vectors[,2]
    #          x          y 
    #  0.6204643 -0.4506738 
    

    We can confirm these are accurate from using dataEllipse:

    library(car)
    dataEllipse(x, y, levels=0.95)
    

    enter image description here