Tags: r, intersection, density-plot

Calculate intersection point of two density curves in R


I have two vectors of 1000 values each (a and b), from which I created density plots and histograms. I would like to retrieve the coordinates (or just the y value) where the two density curves cross. It does not matter if several crossings are detected; I can filter them afterwards. The data are available at the following link: Sample Data

# Shared x-axis range covering both vectors
xlim <- range(c(a, b))

# Histogram of a on the density scale, with its density curve on top
hist(a, breaks = 100,
     freq   = FALSE,
     xlim   = xlim,
     xlab   = 'Test Subject',
     main   = 'Difference plots',
     col    = rgb(0.443137, 0.776471, 0.443137, 0.5),
     border = rgb(0.443137, 0.776471, 0.443137, 0.5))
lines(density(a))

# Histogram of b added to the same plot, with its density curve
hist(b, breaks = 100,
     freq   = FALSE,
     col    = rgb(0.529412, 0.807843, 0.921569, 0.5),
     border = rgb(0.529412, 0.807843, 0.921569, 0.5),
     add    = TRUE)
lines(density(b))

Using locator() is not optimal, since I need to retrieve this from several plots (though I will use that approach if nothing else is viable). Thanks for your help.


Solution

  • We calculate the density curves for both series, taking care to use the same range so that both are evaluated on the same x grid (density() uses 512 points by default). Then, at each x-value, we check whether the y-value for a is greater than the y-value for b. Wherever the outcome of this comparison flips, the curves have crossed.

    df <- merge(
      as.data.frame(density(a, from = xlim[1], to = xlim[2])[c("x", "y")]),
      as.data.frame(density(b, from = xlim[1], to = xlim[2])[c("x", "y")]),
      by = "x", suffixes = c(".a", ".b")
    )
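    # 1 where curve a lies above curve b, 0 otherwise
    df$comp <- as.numeric(df$y.a > df$y.b)
    # Nonzero wherever the comparison flips between neighbouring grid points
    df$cross <- c(NA, diff(df$comp))
    # Mark the crossings on the existing plot
    points(df[which(df$cross != 0), c("x", "y.a")])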
    

    This draws a point at each crossing on the existing plot; the rows selected from df hold the crossing coordinates.
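
To collect the crossings for several pairs of vectors without clicking on each plot, the same idea can be wrapped in a function. The following is a minimal sketch, not part of the original answer: the name density_crossings, its n argument, and the synthetic a and b are assumptions made for illustration. It also interpolates linearly between grid points, which the code above does not do.

```r
# Hypothetical helper (not from the original answer): returns the interpolated
# coordinates of every crossing between the density curves of a and b.
density_crossings <- function(a, b, n = 512) {
  rng <- range(c(a, b))
  # Same from/to/n gives both curves an identical x grid.
  da <- density(a, from = rng[1], to = rng[2], n = n)
  db <- density(b, from = rng[1], to = rng[2], n = n)
  d <- da$y - db$y
  # Indices i where the sign of the difference changes between i and i + 1.
  i <- which(diff(sign(d)) != 0)
  # Linear interpolation of the zero of d between grid points i and i + 1.
  w <- d[i] / (d[i] - d[i + 1])
  data.frame(
    x = da$x[i] + w * (da$x[i + 1] - da$x[i]),
    y = da$y[i] + w * (da$y[i + 1] - da$y[i])
  )
}

# Synthetic stand-in for the question's data (assumption: two shifted normals).
set.seed(1)
a <- rnorm(1000, mean = 0, sd = 1)
b <- rnorm(1000, mean = 2, sd = 1)
crossings <- density_crossings(a, b)
```

Here crossings has one row per detected crossing, with columns x and y, so points(crossings) would mark them on an open plot. Spurious crossings in the tails are possible, as with the original approach, and can be filtered afterwards.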