Search code examples
rggplot2gammgcv

How do you plot smooth components of different GAMs in same panel?


I have two GAMs which have the same predictor variables but different independent variables. I would like to combine the two GAMs to a set of plots where the smooth component (partial residuals) of each predictor variable are in the same panel (differentiated with e.g. color). Reproducible example:

# Required packages
require(mgcv)
require(mgcViz)

# Dataset
data("swiss")

# GAM models
fit1 <- mgcv::gam(Fertility ~ s(Examination) + s(Education), data = swiss)
fit2 <- mgcv::gam(Agriculture ~ s(Examination) + s(Education), data = swiss)

# Converting GAM objects to a gamViz objects
viz_fit1 <- mgcViz::getViz(fit1)
viz_fit2 <- mgcViz::getViz(fit2)

# Make plotGAM objects
trt_fit1 <- plot(viz_fit1, allTerms = T) + l_fitLine()
trt_fit2 <- plot(viz_fit2, allTerms = T) + l_fitLine()

# Print plots
print(trt_fit1, pages = 1)
print(trt_fit2, pages = 1)

Plot of fit1 looks like this:

trt_fit1

And fit2 like this:

trt_fit2

So I would like to combine the two Examinations into one panel, and the two Educations into another one, showing the independent variable (from different GAMs) with different color/linetype.


Solution

  • You could also do this using my {gratia} 📦 and the compare_smooths() function:

    library("gratia")
    library("mgcv")
    
    # Dataset 
    data("swiss") 
    
    # GAM models 
    fit1 <- gam(Fertility ~ s(Examination) + s(Education),
                data = swiss, method = "REML")
    fit2 <- gam(Agriculture ~ s(Examination) + s(Education),
                data = swiss, method = "REML")
    
    # create and object that contains the info to compare smooths
    comp <- compare_smooths(fit1, fit2)
    
    # plot
    draw(comp)
    

    This produces

    enter image description here

    The output from compare_smooth() is a nested data frame (tibble)

    r$> comp                                                                        
    # A tibble: 4 × 5
      model smooth         type  by    data              
      <chr> <chr>          <chr> <chr> <list>            
    1 fit1  s(Education)   TPRS  NA    <tibble [100 × 3]>
    2 fit2  s(Education)   TPRS  NA    <tibble [100 × 3]>
    3 fit1  s(Examination) TPRS  NA    <tibble [100 × 3]>
    4 fit2  s(Examination) TPRS  NA    <tibble [100 × 3]>
    

    So if you want to do customising of the plot etc, you'll need to know how to work with nested data frames or just do

    library("tidyr")
    unnest(comp, data)
    

    which gets you:

    r$> unnest(comp, data)                                                          
    # A tibble: 400 × 8
       model smooth       type  by      est    se Education Examination
       <chr> <chr>        <chr> <chr> <dbl> <dbl>     <dbl>       <dbl>
     1 fit1  s(Education) TPRS  NA     1.19  3.48      1             NA
     2 fit1  s(Education) TPRS  NA     1.37  3.20      1.53          NA
     3 fit1  s(Education) TPRS  NA     1.56  2.94      2.05          NA
     4 fit1  s(Education) TPRS  NA     1.75  2.70      2.58          NA
     5 fit1  s(Education) TPRS  NA     1.93  2.49      3.10          NA
     6 fit1  s(Education) TPRS  NA     2.11  2.29      3.63          NA
     7 fit1  s(Education) TPRS  NA     2.28  2.11      4.15          NA
     8 fit1  s(Education) TPRS  NA     2.44  1.95      4.68          NA
     9 fit1  s(Education) TPRS  NA     2.59  1.82      5.20          NA
    10 fit1  s(Education) TPRS  NA     2.72  1.71      5.73          NA
    # … with 390 more rows
    

    To create your own plots then, we proceed from the unnested data frames and add the confidence interval

    ucomp <- unnest(comp, data) %>%
      add_confint()
    

    Then plot each panel in turn

    library("ggplot2")
    library("dplyr")
    p_edu <- ucomp |>
      filter(smooth == "s(Education)") |> # <-- only one comparison at a time
      ggplot(aes(x = Education, y = est)) +
        geom_ribbon(aes(ymin = lower_ci, ymax = upper_ci, fill = model),
                    alpha = 0.2) +
        geom_line(aes(colour = model)) + 
        scale_fill_brewer(palette = "Set1") +   # <-- change fill scale
        scale_colour_brewer(palette = "Set1") + # <-- change colour scale
        geom_rug(data = swiss,                  # <-- rug
                 mapping = aes(x = Education, y = NULL),
                 sides = "b", alpha = 0.4) +  
        labs(title = "s(Education)", y = "Estimate",
             colour = "Model", fill = "Model")
    
    p_exam <- ucomp |>
      filter(smooth == "s(Examination)") |>
      ggplot(aes(x = Examination, y = est)) +
        geom_ribbon(aes(ymin = lower_ci, ymax = upper_ci, fill = model),
                    alpha = 0.2) +
        geom_line(aes(colour = model)) + 
        scale_fill_brewer(palette = "Set1") +   # <-- change fill scale
        scale_colour_brewer(palette = "Set1") + # <-- change colour scale
        geom_rug(data = swiss,                  # <-- rug
                 mapping = aes(x = Examination, y = NULL),
                 sides = "b", alpha = 0.4) +  
        labs(title = "s(Examination)", y = "Estimate",
             colour = "Model", fill = "Model")
    

    Now use the {patchwork} package to put the plots together

    library("patchwork")
    p_edu + p_exam + plot_layout(guides = "collect")
    

    which produces

    enter image description here

    This is all using {ggplot2} so you'll need to look at other scales if you want more control over the colours ?scale_fill_manual for example or provide other ready-made discrete scales if you want to use an existing palette.

    I could make some of this easier in {gratia} - I could allow users to provide a scale to be used for the colour and fill, and also if they supply the raw data I could draw the rugs too.