
How to emulate QQ plot from ggpubr


There are plenty of threads here about using a QQ plot, but I'm trying to figure out how to calculate one by hand, and in the process I wanted to emulate the one from ggpubr, as it looks better than the base R version. So far I have produced what at least looks like a QQ plot in base R using this method:

#### Load Libraries ####
library(tidyverse)
library(ggpubr)

#### Organize Data ####
x <- sort(iris$Petal.Length)
rank <- c(1:150)
perc.scale <- scale(rank)
df <- data.frame(x,
                 rank,
                 perc.scale)

#### Plot in Base R ####
plot(perc.scale,
     x,
     xlab = "Theoretical Quantiles",
     ylab = "Sample Quantiles",
     main = "Normal Q-Q Plot")

The result is mostly similar to the base R version, but I haven't figured out the QQ line.

[Image: base R plot of the hand-calculated data]

Similarly, I can just use ggqqplot(x) on my data to get this:

[Image: output of ggqqplot(x)]

But when I try to reproduce it in ggplot with my hand-calculated data:

df %>% 
  ggplot(aes(x=perc.scale,
             y=x))+
  geom_point()+
  geom_smooth(method = "lm",
              color = "black",
              se = F,
              linewidth = .5)+
  theme_pubr()+
  labs(x="Theoretical",
       y="Sample")

It still looks totally different:

[Image: ggplot version of the hand-calculated data]

My main questions are: 1) how do I get the correct regression line, and 2) how do I get the standard error area to show up? I'm also unsure why the ggplot version looks rotated compared to the ggpubr version, but that isn't as important for now.


Solution

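  • You could use stat_qq() and stat_qq_line() with the sample aesthetic to get a Q-Q plot in ggplot2. Both compute the theoretical normal quantiles from the data themselves, so the rank and perc.scale columns are not needed for the plot: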

    #### Load Libraries ####
    library(tidyverse)
    library(ggpubr)
    
    #### Organize Data ####
    x <- sort(iris$Petal.Length)
    rank <- c(1:150)
    perc.scale <- scale(rank)
    df <- data.frame(x,
                     rank,
                     perc.scale)
    
    ggplot(df, aes(sample = x)) +
      stat_qq() +
      stat_qq_line() +
      theme_pubr()+
      labs(x="Theoretical",
           y="Sample")
    

    Created on 2023-01-29 with reprex v2.0.2
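
    Since the question asks how to calculate the plot by hand, here is a minimal sketch of the same idea without stat_qq(). It assumes the usual defaults: theoretical quantiles are qnorm() of the plotting positions from ppoints(), and the reference line passes through the first and third quartiles of the sample and of the normal distribution. Note that scale(rank) gives evenly spaced values rather than normal quantiles, and a least-squares fit from geom_smooth(method = "lm") is generally not the quartile-based Q-Q line. The names theoretical, qq_df, slope, and intercept are only illustrative.

```r
# Hand-calculated normal Q-Q plot (a sketch, not ggpubr's own code)
library(ggplot2)

x <- sort(iris$Petal.Length)
n <- length(x)

# Theoretical quantiles: normal quantiles at the plotting positions
p <- ppoints(n)            # (i - 0.5) / n when n > 10
theoretical <- qnorm(p)

# Reference line through the 25% and 75% quantiles of sample and theory
y_q <- quantile(x, c(0.25, 0.75), names = FALSE)
x_q <- qnorm(c(0.25, 0.75))
slope <- diff(y_q) / diff(x_q)
intercept <- y_q[1] - slope * x_q[1]

qq_df <- data.frame(theoretical = theoretical, sample = x)

qq_plot <- ggplot(qq_df, aes(x = theoretical, y = sample)) +
  geom_point() +
  geom_abline(intercept = intercept, slope = slope) +
  labs(x = "Theoretical", y = "Sample")

print(qq_plot)
```

    Adding theme_pubr() from ggpubr to qq_plot should bring the styling closer to the ggqqplot() output.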


    To add a confidence band, you could use the qqplotr package, which provides stat_qq_band() alongside stat_qq_point() and its own stat_qq_line():

    #### Load Libraries ####
    library(tidyverse)
    library(ggpubr)
    library(qqplotr)
    
    # df is the data frame built in the previous block
    ggplot(df, aes(sample = x)) +
      stat_qq_band() +
      stat_qq_point() +
      stat_qq_line() +
      theme_pubr()+
      labs(x="Theoretical",
           y="Sample")
    

    Created on 2023-01-29 with reprex v2.0.2
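
    For a hand-calculated version of the band, one common choice is a pointwise normal-approximation interval around the reference line, with standard error (slope / dnorm(z)) * sqrt(p * (1 - p) / n). The sketch below assumes that formula and a 95% level; I have not checked that it reproduces the defaults of ggqqplot() or stat_qq_band() exactly, so treat it as an approximation. The names fit, se, crit, and band_df are illustrative.

```r
# Hand-calculated pointwise confidence band for a normal Q-Q plot (sketch)
library(ggplot2)

x <- sort(iris$Petal.Length)
n <- length(x)
p <- ppoints(n)
z <- qnorm(p)                       # theoretical quantiles

# Quartile-based reference line, as in a standard Q-Q line
y_q <- quantile(x, c(0.25, 0.75), names = FALSE)
x_q <- qnorm(c(0.25, 0.75))
slope <- diff(y_q) / diff(x_q)
intercept <- y_q[1] - slope * x_q[1]

# Assumed band: fitted line +/- critical value * standard error
fit <- intercept + slope * z
se <- (slope / dnorm(z)) * sqrt(p * (1 - p) / n)
crit <- qnorm(0.975)                # two-sided 95% level (assumption)

band_df <- data.frame(z = z, sample = x, fit = fit,
                      lower = fit - crit * se,
                      upper = fit + crit * se)

band_plot <- ggplot(band_df, aes(x = z)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
  geom_line(aes(y = fit)) +
  geom_point(aes(y = sample)) +
  labs(x = "Theoretical", y = "Sample")

print(band_plot)
```

    The band is narrowest near the median and widens toward the tails, because dnorm(z) shrinks there.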