I'm trying to graph two normal distributions over two histograms in the same plot in R. Here is an example of what I would like it to look like:
Here is my current code but I'm not getting the second Normal distribution to properly overlay:
g = R_Hist$`AvgFeret,20-60`
m<-mean(g)
std<-sqrt(var(g))
h <- hist(g, breaks = 20, xlab="Average Feret Diameter", main = "Histogram of 60-100um beads", col=adjustcolor("red", alpha.f =0.2))
xfit <- seq(min(g), max(g), length = 680)
yfit <- dnorm(xfit, mean=mean(g), sd=sd(g))
yfit <- yfit*diff(h$mids[1:2]) * length(g)
lines(xfit, yfit, col = "red", lwd=2)
k = R_Hist$`AvgFeret,60-100`
ms <-mean(k)
stds <-sqrt(var(k))
j <- hist(k, breaks=20, add=TRUE, col = adjustcolor("blue", alpha.f = 0.3))
xfit <- seq(min(j), max(j), length = 314)
yfit <- dnorm(xfit, mean=mean(j), sd=sd(j))
yfit <- yfit*diff(j$mids[1:2]) * length(j)
lines(xfit, yfit, col="blue", lwd=2)
and here is the graph this code is generating:
I haven't yet worked out how to rescale the axes, so any help on that would also be appreciated, but I'm sure I can just look that up! Should I be using ggplot2 for this application? If so, how do you overlay a normal curve in that library?
Also as a side note, here are the errors generated from graphing the second (blue) line:
To have them on the same scale, the easiest might be to run hist() first, without plotting, to get the values.
h <- hist(g, breaks = 20, plot = FALSE)
j <- hist(k, breaks = 20, plot = FALSE)
ymax <- max(c(h$counts, j$counts))
xmin <- 0.9 * min(c(g, k))
xmax <- 1.1 * max(c(g,k))
Then you can simply use the parameters xlim and ylim in your first call to hist():
h <- hist(g, breaks = 20,
          xlab = "Average Feret Diameter",
          main = "Histogram of 60-100um beads",
          col = adjustcolor("red", alpha.f = 0.2),
          xlim = c(xmin, xmax),
          ylim = c(0, ymax))
The errors for the second (blue) line are because you didn't replace j (the histogram object) with k (the raw values):
xfit <- seq(min(k), max(k), length = 314)
yfit <- dnorm(xfit, mean=mean(k), sd=sd(k))
yfit <- yfit*diff(j$mids[1:2]) * length(k)
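Putting the pieces above together, here is a minimal sketch of the whole base-R plot. Since R_Hist isn't shown, g and k are simulated stand-ins (an assumption), and scaled_normal() is a hypothetical helper name, not something from your code. It converts the density to counts the same way your code does (density times bin width times number of observations). It also includes the curve heights in ylim, since the peak of a fitted curve can end up slightly above the tallest bar.

```r
# Simulated stand-ins for the two R_Hist columns (assumption: real data not shown)
set.seed(1)
g <- rnorm(680, mean = 40, sd = 8)
k <- rnorm(314, mean = 80, sd = 10)

# Bin both samples without plotting, so the counts are known up front
h <- hist(g, breaks = 20, plot = FALSE)
j <- hist(k, breaks = 20, plot = FALSE)

# Hypothetical helper: normal density rescaled from density to counts
scaled_normal <- function(x, hist_obj, n_points = 200) {
  xfit <- seq(min(x), max(x), length.out = n_points)
  binwidth <- diff(hist_obj$breaks[1:2])  # bins are equally wide here
  yfit <- dnorm(xfit, mean = mean(x), sd = sd(x)) * binwidth * length(x)
  list(x = xfit, y = yfit)
}

fit_g <- scaled_normal(g, h)  # raw values g, histogram object h
fit_k <- scaled_normal(k, j)  # raw values k, histogram object j

# Common limits covering both histograms and both curves
xlim <- c(0.9 * min(c(g, k)), 1.1 * max(c(g, k)))
ylim <- c(0, max(h$counts, j$counts, fit_g$y, fit_k$y))

# Draw the precomputed histograms, then overlay the two curves
plot(h, col = adjustcolor("red", alpha.f = 0.2),
     xlim = xlim, ylim = ylim,
     xlab = "Average Feret Diameter", main = "Histogram of bead diameters")
plot(j, col = adjustcolor("blue", alpha.f = 0.3), add = TRUE)
lines(fit_g$x, fit_g$y, col = "red", lwd = 2)
lines(fit_k$x, fit_k$y, col = "blue", lwd = 2)
```

With your own data, replace the two rnorm() lines with the R_Hist columns; the rest should not need to change.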
As for the ggplot2 approach, you can find a good answer here and in the posts linked therein.
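For a rough idea of what that can look like, here is one possible sketch, not necessarily what the linked answers do. It assumes ggplot2 is installed and uses simulated data in a long-format data frame. The names beads, diameter, size_class and scaled_dnorm(), and the bin width of 2, are made up for illustration. The curve is again the normal density multiplied by the bin width and the number of observations, so that it sits on the count scale.

```r
library(ggplot2)

# Simulated stand-in data in long format (assumption: real data not shown)
set.seed(1)
beads <- data.frame(
  diameter   = c(rnorm(680, mean = 40, sd = 8), rnorm(314, mean = 80, sd = 10)),
  size_class = rep(c("20-60", "60-100"), times = c(680, 314))
)
binwidth <- 2  # hypothetical bin width, shared by both histograms

# Hypothetical helper: returns a function giving the count-scaled normal density
scaled_dnorm <- function(x, binwidth) {
  m <- mean(x)
  s <- sd(x)
  n <- length(x)
  function(q) dnorm(q, mean = m, sd = s) * n * binwidth
}

g <- beads$diameter[beads$size_class == "20-60"]
k <- beads$diameter[beads$size_class == "60-100"]

p <- ggplot(beads, aes(x = diameter)) +
  # position = "identity" overlays the two histograms instead of stacking them
  geom_histogram(aes(fill = size_class), binwidth = binwidth,
                 position = "identity", alpha = 0.3) +
  stat_function(fun = scaled_dnorm(g, binwidth), colour = "red") +
  stat_function(fun = scaled_dnorm(k, binwidth), colour = "blue") +
  scale_fill_manual(values = c("20-60" = "red", "60-100" = "blue")) +
  labs(x = "Average Feret Diameter", y = "Count", fill = "Size class")
print(p)
```

Here ggplot2 picks axis limits that cover both histograms, so the manual xlim/ylim step from the base-R version isn't needed.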