Peak Analysis of Density Histogram - R


I'm working in R and need to plot a density comparison of two samples:

samples1:

id1 2
id2 2
id3 2
id4 2
id5 2
id6 2
id7 3
id8 3
id9 3
id10 2

and samples2:

id1 1
id2 1
id3 1
id4 1
id5 1
id6 1
id7 1
id8 2
id9 2
id10 1

The first column holds the IDs and the second column holds the value for each ID. For example:

id1 - value
id2 - value
...

  • Note that the two samples have the same size and the same IDs, but different values.

I used the following code to plot this image:

library(ggplot2)
library(sm)
library(scales)

# Read each sample (V1 = ID, V2 = value) and label it with its type
x <- data.frame(read.table("C:/Users/Filli/Desktop/pasta/samples1.txt"), type = "Agregado")
y <- data.frame(read.table("C:/Users/Filli/Desktop/pasta/samples2.txt"), type = "Temporal")

# Stack both samples into one data frame
data <- rbind(x, y)

# Overlay one filled density curve per type
print(ggplot(data, aes(x = V2, group = type)) +
  geom_density(aes(fill = type), size = 0, alpha = 0.7))

http://prntscr.com/fdespg

How can I find the IDs that fall under the peak of each density (red and blue)? For example, I would like the IDs contained within the red rectangle in the image above, for each density. The rectangle is only there to illustrate; I do not want to plot it. Assume each density has a single large peak. I want to get the IDs at that peak for each sample and compare them to see whether they are the same or different.


Solution

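  • You can use the base R density function, which ggplot2 also uses to draw density charts. To find the location of the peak, for example, you could use

    density(data$V2)$x[which.max(density(data$V2)$y)]
    

    Here data holds both samples, so subset by type first (for example data$V2[data$type == "Agregado"]) to get the peak of a single curve. See ?density for more details.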
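
To go from the peak location to the IDs, one possible approach is to keep the IDs whose value lies close to the peak and then compare the two sets. The sketch below is only an illustration under a few assumptions: the data frames are built inline from the values in the question instead of being read from the files, peak_ids is a hypothetical helper name, and the tolerance tol = 0.5 is an arbitrary stand-in for the width of the red rectangle, so adjust it to your data.

```r
# Values copied from the question; V1 = ID, V2 = value.
ids <- paste0("id", 1:10)
samples1 <- data.frame(V1 = ids, V2 = c(2, 2, 2, 2, 2, 2, 3, 3, 3, 2),
                       stringsAsFactors = FALSE)
samples2 <- data.frame(V1 = ids, V2 = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 1),
                       stringsAsFactors = FALSE)

# Hypothetical helper: IDs whose value lies within `tol` of the density peak.
# `tol` is an assumed window width, not something density() provides.
peak_ids <- function(df, tol = 0.5) {
  d <- density(df$V2)              # kernel density estimate (stats package)
  peak <- d$x[which.max(d$y)]      # x position of the highest point
  df$V1[abs(df$V2 - peak) <= tol]  # IDs whose value is near that position
}

ids1 <- peak_ids(samples1)
ids2 <- peak_ids(samples2)

intersect(ids1, ids2)  # IDs found in both peaks
setdiff(ids1, ids2)    # IDs only in the samples1 peak
setdiff(ids2, ids1)    # IDs only in the samples2 peak
```

With the question's data, the samples1 peak sits near 2 and the samples2 peak near 1, so id7 ends up only in the samples2 peak while id1 to id6 and id10 are in both. A narrower or wider tol will change which IDs count as being "in" the peak.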