R package DHARMa detects deviations from normality even if a mixed model is fitted to gaussian-generated data

[Main edits in italics]

Using R I generated data from normal distributions but, when I fitted a random-intercept mixed-effect model to them, both the Kolmogorov-Smirnoff test for normality and the Quantile-Quantile plot performed by the R package DHARMa indicated deviations from normality. Why is that?

Test 1

I generated a set of random-intercept data in which both the number of replicates and the within-group standard deviation varies for each level of the random effect:

rm(list=ls())
library(DHARMa)
library(glmmTMB)
library(ggplot2)

# "regression" with an 8-level RE with variable slopes:
set.seed(666)
ii=8 # parameter set size; no. groups
reps <- sample(x=8:12, size=ii, replace=T)
slope <- 3
intercept <- runif(ii, min = 70, max = 140)
group.sd <- runif(ii, min = 5, max = 15)
group.ID <- LETTERS[1:ii]

xx <- 1:15

out.list <- list() # empty list to store simulated data

for (i in 1:ii){
        df00 <- data.frame(Group=rep(group.ID[i], reps[i]*length(xx)),
                                    x=rep(xx, reps[i]))
        df00$y = intercept[i] + df00$x * slope + rnorm(length(df00[,1]), 0, group.sd[i])
        out.list[[i]] = df00
        }

# to turn out.list into a data.frame:
df <- do.call(rbind.data.frame, out.list)

# data visualisation
ggplot(df, aes(x = x, y = y, colour = Group)) + 
    geom_point() + geom_smooth(method="lm") + theme_classic()

If I fit a random-intercept model to my dataset:

glmmTMB_ri <- glmmTMB(y~x + (1|Group), data=df)

These are the diagnostic plots and tests produced by DHARMa:

mem1Resids <- simulateResiduals(glmmTMB_ri, n = length(df[, 1]) * 3)
par(mfrow = c(1, 2))
plotQQunif(mem1Resids)
mtext(text = "(a)", side = 3, adj = 0, line = 2)
plotResiduals(mem1Resids, quantreg = T)
mtext(text = "(b)", side = 3, adj = 0, line = 2)

Test 2

I thought that the issue could that the resolution of the simulated residuals was too low. Increasing the number of simulated observations by simulateResiduals() from three times the size of my dataset to 10 times the size of my dataset doesn't improve the diagnostic plots:

Test 3

I thought the issue might not be with DHARMa's settings but with my dataset (perhaps too small or too variable). I generated a new dataset with larger, uniform replication levels for each random group (40 observations for each value of x for each group) and with lower, uniform whithin-group variability (SD=5 for all random groups):

reps <- rep(40, length.out=ii)
# let's also reduce and uniformise maximum within-group variability:
group.sd <- rep(5, ii)

out.list <- list() # empty list to store simulated data

for (i in 1:ii){
        df00 <- data.frame(Group=rep(group.ID[i], reps[i]*length(xx)),
                                    x=rep(xx, reps[i]))
        df00$y = intercept[i] + df00$x * slope + rnorm(length(df00[,1]), 0, group.sd[i])
        out.list[[i]] = df00
        }

# to turn out.list into a data.frame:
df2 <- do.call(rbind.data.frame, out.list)

# data visualisation
ggplot(df2, aes(x = x, y = y, colour = Group)) +
    geom_point() + geom_smooth(method="lm") + theme_classic()

If I fit a random-intercept model to my new dataset:

glmmTMB_ri2 <- glmmTMB(y~x + (1|Group), data=df2)

These are the diagnostic plots and tests produced by DHARMa:

mem2Resids <- simulateResiduals(glmmTMB_ri2, n = length(df2[, 1]) * 3)
par(mfrow = c(1, 2))
plotQQunif(mem2Resids)
mtext(text = "(a)", side = 3, adj = 0, line = 2)
plotResiduals(mem2Resids, quantreg = T)
mtext(text = "(b)", side = 3, adj = 0, line = 2)

If anything, the diagnostic plots show a worse situation than they did before.

Why does DHARMa indicate non-normality of my models' residual's distributions even if I generated my data to be gaussian? If the problem isn't with the sample size or with DHARMa, perhaps I simulated my data incorrectly, but that doesn't seem to be the case either.

Solution

Disclaimer: I'm the developer of DHARMa.

First of all, for DHARMa-specific questions, please also consider using our issue tracker https://github.com/florianhartig/DHARMa/issues

Second, regarding your question / results: when you simulate with difference SDs for the REs, you violate the assumptions of the mixed models, which assumes REs come from a normal distribution with a common standard deviation. EDIT: plus, it seems to me you are drawing the random intercepts from a uniform distribution, whereas a standard mixed model assumes normal.

Thus, I'm not sure: why are you concerned? I would say it's rather encouraging that this problem is highlighted by DHARMa. If your question is about the specific reason why this is highlighted - the DHARMa default is to re-simulate unconditionally, i.e. with REs. If you would switch to simulations conditional on the fitted REs, which corresponds closer to standard residual checkes, the problem would likely be less visible.

Finally, note that DHARMa has a function createData() that allows you to create data according to the assumptions of a mixed model.