r, lme4, mixed-models

Stop plotting predictions beyond data limits LME ggpredict Effects


Using the 'iris' dataset (slightly modified as below), I plot the results of an LME.

PLEASE NOTE: I am only using the iris dataset as mock data for the purpose of plotting, so please do not critique the appropriateness of this test. I'm not interested in the statistics, rather the plotting.

When I use the ggpredict function and plot the results, the plot extends the predictions beyond the range of the data. Is there a systematic way to plot predictions only within the range of the data in each facet?

I can plot each facet separately, limit the axis of each plot manually, and cowplot them back together, but if there is a way to say 'predict only to the max. and min. of the data for that group', this would be great. Given that these are facets of a single model, perhaps not showing the predictions for different groups is in fact misleading, and I should rather create three different models if I only want predictions within those data subsets?

library(lme4)
library(ggeffects)
library(ggplot2)
library(dplyr)  # provides %>%, mutate() and glimpse()

data(iris)
df <- iris
glimpse(df)

# Shift Sepal.Length per species so the groups cover different x ranges
df_ed <- df %>%
  mutate(Sepal.Length = ifelse(Species == "setosa", Sepal.Length + 10, Sepal.Length)) %>%
  mutate(Sepal.Length = ifelse(Species == "versicolor", Sepal.Length - 3, Sepal.Length))
glimpse(df_ed)

# Mixed model with a random intercept for Petal.Width
m_test <- lmer(
  Sepal.Width ~ Sepal.Length * Species + (1 | Petal.Width),
  data = df_ed, REML = TRUE
)
summary(m_test)

# Faceted prediction plot with the raw data overlaid
test_plot <- ggpredict(m_test, c("Sepal.Length", "Species"), type = "re") %>%
  plot(rawdata = TRUE, dot.alpha = 0.6, facet = TRUE, alpha = 0.3)

[Figure: Predicting beyond data range]


Solution

  • Usually, the limit_range argument of plot() (limit.range in ggeffects versions before 1.3.2) should help you: set it to TRUE, as in the second plot() call below.

    Note the new argument names, introduced in ggeffects version 1.3.2 (the old argument names still work, though):

    library(lme4)
    library(ggeffects)
    
    data(iris)
    m_test <- lmer(Sepal.Width ~ Sepal.Length * Species + (1 | Petal.Width), data = iris)
    pr <- ggpredict(m_test, c("Sepal.Length", "Species"), type = "re")
    plot(pr, show_data = TRUE, dot_alpha = 0.6, facet = TRUE, alpha = 0.3)
    #> Data points may overlap. Use the `jitter` argument to add some amount of
    #>   random variation to the location of data points and avoid overplotting.
    

    plot(pr, show_data = TRUE, dot_alpha = 0.6, facet = TRUE, alpha = 0.3, limit_range = TRUE)
    #> Data points may overlap. Use the `jitter` argument to add some amount of
    #>   random variation to the location of data points and avoid overplotting.
    

    Created on 2023-11-10 with reprex v2.0.2
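
If the predictions are needed as a table rather than a plot, the same "only within each group's observed range" restriction can be sketched by hand. The helper below, clip_to_group_range(), is hypothetical and not part of ggeffects; it assumes a prediction table with columns named x and group, which is the layout ggpredict() returns. To keep the sketch runnable with base R only, the demo uses a plain lm() and a hand-built prediction grid instead of the mixed model above.

```r
# Hypothetical helper (not a ggeffects function): keep only the prediction rows
# whose x value lies inside the observed x range of the matching group.
clip_to_group_range <- function(pred, data, x_col, group_col) {
  # Observed minimum and maximum of x per group, named by group level
  lo <- tapply(data[[x_col]], data[[group_col]], min)
  hi <- tapply(data[[x_col]], data[[group_col]], max)
  grp <- as.character(pred$group)
  keep <- pred$x >= lo[grp] & pred$x <= hi[grp]
  pred[keep, , drop = FALSE]
}

# Demo with a plain lm() (assumption: stands in for the lmer() model)
fit <- lm(Sepal.Width ~ Sepal.Length * Species, data = iris)

# Prediction grid spanning the full x range for every species
grid <- expand.grid(
  Sepal.Length = seq(min(iris$Sepal.Length), max(iris$Sepal.Length), by = 0.1),
  Species = levels(iris$Species)
)
grid$predicted <- predict(fit, newdata = grid)

# Rename to the x / predicted / group layout assumed by the helper
pred <- data.frame(
  x = grid$Sepal.Length,
  predicted = grid$predicted,
  group = grid$Species
)

clipped <- clip_to_group_range(pred, iris, "Sepal.Length", "Species")
```

The clipped table can then be passed to ggplot2 directly, for example with geom_line() and facet_wrap(~ group). Note that this only hides predictions outside each group's data; the fitted model itself is unchanged.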