I have two datasets and I used the qqplot function in R to compare their distributions. I would like to fit a line through the plot, but the qqline function is not appropriate since we have two datasets. Any idea what I can use?
x <- rnorm(10000, 12, 3)
y <- rnorm(10000, 18, 5)
qqplot(x, y)
abline(lm(y ~ x))
Remember that a qqplot of x versus y is just the same as plot(sort(x), sort(y)) when both samples have the same length, as they do here:
x <- rnorm(10000, 12, 3)
y <- rnorm(10000, 18, 5)
qqplot(x, y)
plot(sort(x), sort(y))
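If you want to confirm this numerically rather than by eye, here is a minimal sketch. It assumes equal sample sizes; the seed and the names qq, same_x and same_y are only illustrative. When the lengths differ, qqplot interpolates the longer sample, so the plain sort comparison no longer applies.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(10000, 12, 3)
y <- rnorm(10000, 18, 5)

# plot.it = FALSE returns the plotted coordinates without drawing anything
qq <- qqplot(x, y, plot.it = FALSE)

# Compare the returned coordinates with the sorted samples
same_x <- isTRUE(all.equal(qq$x, sort(x)))
same_y <- isTRUE(all.equal(qq$y, sort(y)))
```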
The problem in your example is that you are adding the regression line for y on x, rather than for the sorted versions of x and y. Effectively, you are plotting the regression line for this:
plot(x, y)
Because x and y are generated independently, that line is, not surprisingly, almost perfectly flat, with an intercept close to the mean of y.
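A quick way to see this is to look at the coefficients of the unsorted fit. This is only a sketch; the seed and the name fit_raw are illustrative.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(10000, 12, 3)
y <- rnorm(10000, 18, 5)

# Regression on the raw, unsorted values (what abline(lm(y ~ x)) draws)
fit_raw <- lm(y ~ x)

coef(fit_raw)  # slope near zero, intercept near mean(y)
mean(y)
cor(x, y)      # near zero, since x and y are independent draws
```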
Instead, regress the sorted y on the sorted x to get the regression line for the qqplot:
x <- rnorm(10000, 12, 3)
y <- rnorm(10000, 18, 5)
qqplot(x, y)
abline(lm(sort(y) ~ sort(x)), col = "red", lwd = 2, lty = 2)
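If you would rather have something closer to what qqline does for a single sample, one possible alternative (not part of the approach above, just a sketch) is a line through the first and third quartiles of both samples. The names fit_sorted, qx, qy, slope and intercept are illustrative. For these two normal distributions both lines should land near the theoretical relationship y = 18 + (5/3)(x - 12), that is, a slope of about 1.67 and an intercept of about -2.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(10000, 12, 3)
y <- rnorm(10000, 18, 5)

# Least-squares line through the sorted values
fit_sorted <- lm(sort(y) ~ sort(x))

# Alternative: line through the first and third quartiles of each sample,
# which is less sensitive to the extreme tails than least squares
qx <- quantile(x, c(0.25, 0.75))
qy <- quantile(y, c(0.25, 0.75))
slope <- unname(diff(qy) / diff(qx))
intercept <- unname(qy[1] - slope * qx[1])

qqplot(x, y)
abline(fit_sorted, col = "red", lwd = 2, lty = 2)                 # least squares
abline(a = intercept, b = slope, col = "blue", lwd = 2, lty = 3)  # quartile line
```

With heavy-tailed or skewed data the two lines can differ noticeably, and the quartile line is usually the one that tracks the bulk of the points.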