I would like to generate a separate Q-Q plot for each numeric variable in a data frame to assess univariate normality (only an x variable is required). The plots do not have to be stored in a list; they only need to be displayed in RStudio.
I've tried several approaches with no luck, including qqnorm/qqline (base R), various iterations of qplot (ggplot2), and qqPlot (EnvStats), in conjunction with apply and a for loop. Below are a few examples. txhousing is from ggplot2.
Use any library you deem appropriate to address the intent of the question.
library(ggplot2)   # provides txhousing
library(EnvStats)  # provides the qqPlot() used below
df <- txhousing
df.num.vec <- names(df)[sapply(df, is.numeric)]
df.num <- df[, df.num.vec]
apply(df.num,2,qqPlot)
This results in a series of warnings (not errors) about missing values in several columns:
Warning messages:
1: In is.not.finite.warning(x) :
There were 568 nonfinite values in x : 568 NA's
2: In FUN(newX[, i], ...) :
568 observations with NA/NaN/Inf in 'x' removed.
3: In is.not.finite.warning(x) :
There were 568 nonfinite values in x : 568 NA's
4: In FUN(newX[, i], ...) :
568 observations with NA/NaN/Inf in 'x' removed.
5: In is.not.finite.warning(x) :
There were 616 nonfinite values in x : 616 NA's
6: In FUN(newX[, i], ...) :
616 observations with NA/NaN/Inf in 'x' removed.
7: In is.not.finite.warning(x) :
There were 1424 nonfinite values in x : 1424 NA's
8: In FUN(newX[, i], ...) :
1424 observations with NA/NaN/Inf in 'x' removed.
9: In is.not.finite.warning(x) :
There were 1467 nonfinite values in x : 1467 NA's
10: In FUN(newX[, i], ...) :
1467 observations with NA/NaN/Inf in 'x' removed.
df <- txhousing
for (i in seq_along(df)) {
x <- df[[i]]
if (!is.numeric(x)) next
qqPlot(df[,i])
}
This results in:
Error in qqPlot(df[, i]) : 'x' must be a numeric vector
Since you have already saved the numeric columns in df.num, you can loop over df.num directly:
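# Use ggplot2 to create the Q-Q plots and store them in a list
library(ggplot2)
# Create an empty list to hold the plots
qq_list <- list()
# Loop over the numeric column names
for (var in names(df.num)) {
  # geom_qq() requires the `sample` aesthetic, not x.
  # !!var inserts the current column name now; without it every plot
  # would be built from the last value of `var` when printed.
  qq_list[[var]] <- ggplot(df.num, aes(sample = .data[[!!var]])) +
    geom_qq() +
    geom_qq_line() +
    ggtitle(paste0("Q-Q Plot of ", var))
}
# Print the list; each plot is drawn in turn
print(qq_list)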
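If a single overview figure is acceptable in addition to the separate plots, one possible variation is to reshape the numeric columns to long form and facet by variable. This is only a sketch and not part of the approach above; `num_cols` and `long` are illustrative names, and base R's stack() is used here simply to avoid an extra reshaping package.

```r
library(ggplot2)

# Keep only the numeric columns of txhousing (the data set ships with ggplot2)
num_cols <- Filter(is.numeric, as.data.frame(txhousing))

# stack() reshapes to long form: `values` holds the data, `ind` the column name
long <- stack(num_cols)

# One panel per variable; free scales because the variables differ in magnitude
p <- ggplot(long, aes(sample = values)) +
  geom_qq(na.rm = TRUE) +
  geom_qq_line(na.rm = TRUE) +
  facet_wrap(~ ind, scales = "free") +
  ggtitle("Normal Q-Q plots of the numeric columns")
print(p)
```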
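Using car:
library(car)
# car and EnvStats both export qqPlot(), so call car's version explicitly
for (i in seq_along(df.num)) {
  # [[i]] extracts the column as a plain vector; [, i] on a tibble returns a tibble
  car::qqPlot(df.num[[i]], main = names(df.num)[i], ylab = names(df.num)[i])
}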
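The `'x' must be a numeric vector` error from the for loop in the question is consistent with txhousing being a tibble: with tibble's subsetting method, `df[, i]` returns a one-column tibble rather than a vector, whereas `df[[i]]` returns the column itself. A small check, assuming the tibble package is installed (`tb`, `col_as_tibble` and `col_as_vector` are illustrative names):

```r
library(tibble)

# Make the tibble class explicit so the subsetting behaviour is unambiguous
tb <- as_tibble(ggplot2::txhousing)

col_as_tibble <- tb[, "sales"]    # single-bracket subsetting keeps the tibble
col_as_vector <- tb[["sales"]]    # double brackets extract the numeric vector

is.numeric(col_as_tibble)  # a data frame is not a numeric vector
is.numeric(col_as_vector)  # the extracted column is
```

Any plotting function that expects a numeric vector should therefore be given `df[[i]]` (or `df$name`) rather than `df[, i]`.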
If you want to save the plots to a file, for example a PDF, you can do the following:
myqq <- "qq.pdf"
pdf(file = myqq)
for (i in seq_along(df.num)) {
  # [[i]] passes a plain numeric vector, which qqPlot() expects
  car::qqPlot(df.num[[i]], main = names(df.num)[i], ylab = names(df.num)[i])
}
dev.off()
The file qq.pdf is written to your working directory, with one plot per page.
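For completeness, the same loop can be written with base R's qqnorm() and qqline(), which the question mentions having tried. This is a minimal sketch rather than a definitive solution; `num_cols` and `nm` are illustrative names, and ggplot2 is referenced only for the txhousing data. Both functions drop missing values themselves, so no extra NA handling is needed here.

```r
# Keep only the numeric columns; Filter() works on data frames and tibbles alike
num_cols <- Filter(is.numeric, ggplot2::txhousing)

for (nm in names(num_cols)) {
  x <- num_cols[[nm]]  # [[ ]] returns a plain numeric vector
  qqnorm(x, main = paste("Normal Q-Q plot of", nm))
  qqline(x)            # reference line through the first and third quartiles
}
```

In RStudio each plot replaces the previous one in the Plots pane, and the back arrow steps through the earlier ones.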