
Updating a fitted R mgcv::bam model reports an "invalid type (closure)..." error


I am trying to update a fitted mgcv::bam model with bam.update(). The model was fitted using the "weights" argument, and the update fails with the following error:

Error in model.frame.default(gp$fake.formula, data, weights = weights,  : 
  invalid type (closure) for variable '(weights)'

I could not fix this myself, so I am posting the problem here. The example below reproduces exactly the error I get with a much larger model:

library(data.table)
library(mgcv)

mtcars <- data.table(mtcars)

# adding arbitrary "model_weigts"
set.seed(55)
mtcars[, model_weigts := abs(rnorm(nrow(mtcars)))]

# split the dataset
mtcars_1 <- mtcars[1:20,]
mtcars_2 <- mtcars[21:32,]

# an arbitrary model formula 
formula_c <- formula(mpg ~ s(wt) + s(hp))

# fit the initial model to mtcars_1 and attempt to update it with mtcars_2
model_initial <- mgcv::bam(formula_c, data = mtcars_1, weights = model_weigts)
model_updated <- mgcv::bam.update(model_initial, data = mtcars_2)
sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Adelaide
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] mgcv_1.9-1        nlme_3.1-164      data.table_1.15.2

loaded via a namespace (and not attached):
[1] compiler_4.3.3 Matrix_1.6-5   splines_4.3.3  grid_4.3.3     lattice_0.22-5

Solution

  • This should probably be a comment, but at first I couldn't reproduce your problem.

    Actually, I can. I thought your example loaded a fresh copy of mtcars, but it doesn't, so when I reset after my checks I wasn't actually resetting to a clean mtcars.

    But that also points to the solution. Take Simon literally in terms of what he writes in ?bam.update:

    Must include a weights column if the original fit was weighted
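
    To see why the column name matters, here is a minimal base-R sketch. It assumes, as the error trace in the question suggests, that bam.update() ends up calling model.frame() with weights = weights on the new data. The formula and the helper names res, d and mf are only illustrative. When the data has no column called weights, the symbol resolves to the stats::weights() function, which is a closure:

```r
# No column named `weights` in the data, so the symbol `weights`
# resolves to the function stats::weights(), which is a closure.
res <- tryCatch(
  model.frame(mpg ~ wt, data = mtcars, weights = weights),
  error = function(e) conditionMessage(e)
)
print(res)

# With a column literally named `weights`, the same call succeeds
# and the model frame gains a "(weights)" column.
d <- mtcars
d$weights <- 1
mf <- model.frame(mpg ~ wt, data = d, weights = weights)
names(mf)
```

    The first call should fail with the same "invalid type (closure)" message as in the question; the second should not.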

    I can get this to work if I set up my model to use weights instead of model_weigts [sic]:

    r$> library(data.table) 
        library(mgcv) 
         
        mtcars <- data.table(mtcars) 
         
        # adding arbitrary weights in a column named "weights" 
        set.seed(55) 
        mtcars[, weights := abs(rnorm(nrow(mtcars)))] 
         
        # split the dataset 
        mtcars_1 <- mtcars[1:20,] 
        mtcars_2 <- mtcars[21:32,] 
         
        # an arbitrary model formula  
        formula_c <- formula(mpg ~ s(wt) + s(hp)) 
         
        # fit the initial model to mtcars_1 and attempt to update it with mtcars_2 
        model_initial <- mgcv::bam(formula_c, data = mtcars_1, weights =  weights) 
        model_updated <- mgcv::bam.update(model_initial, data = mtcars_2)           
    r$>
    

    Note that there is no error, and I see:

    r$> model_updated$weights                                                       
     [1] 0.120139084 1.812376850 0.151582984 1.119221005 0.001908206 1.188518494
     [7] 0.505343855 0.099234393 0.305353199 0.198409703 0.048910950 0.843233767
    [13] 2.075270771 0.360763148 0.637689661 0.366278030 2.355363898 1.093377235
    [19] 0.285841001 0.993657777 1.519271950 1.497117849 0.819615308 1.066050411
    [25] 0.733755906 0.960343958 0.692180983 1.405998395 1.633538987 0.261830993
    [31] 1.564754415 0.314589303
    r$> mtcars_1$weights                                                            
     [1] 0.120139084 1.812376850 0.151582984 1.119221005 0.001908206 1.188518494
     [7] 0.505343855 0.099234393 0.305353199 0.198409703 0.048910950 0.843233767
    [13] 2.075270771 0.360763148 0.637689661 0.366278030 2.355363898 1.093377235
    [19] 0.285841001 0.993657777
    r$> mtcars_2$weights                                                            
     [1] 1.5192720 1.4971178 0.8196153 1.0660504 0.7337559 0.9603440 0.6921810
     [8] 1.4059984 1.6335390 0.2618310 1.5647544 0.3145893
    r$> all.equal(model_updated$weights, c(mtcars_1$weights, mtcars_2$weights))     
    [1] TRUE
    

    so bam.update() updated the fit to include the new observations, using their weights, in addition to the original data and weights.
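
    If renaming the column everywhere is inconvenient, a possible variation is to keep your own column name for the initial fit and add a column called weights only to the new data just before updating. This is a sketch, not something taken from the mgcv documentation: it assumes bam.update() looks for weights only in the data passed to it, and obs_w, dat, dat_1 and dat_2 are hypothetical names.

```r
library(mgcv)

# Toy data with the weights stored under a different (hypothetical) name
set.seed(55)
dat <- mtcars
dat$obs_w <- abs(rnorm(nrow(dat)))

dat_1 <- dat[1:20, ]
dat_2 <- dat[21:32, ]

# Initial fit: bam() accepts any column name for the weights argument
model_initial <- bam(mpg ~ s(wt) + s(hp), data = dat_1, weights = obs_w)

# Assumption: bam.update() needs a column literally called `weights`
# in the new data, so copy the weights into one before updating
dat_2$weights <- dat_2$obs_w
model_updated <- bam.update(model_initial, data = dat_2)

# Compare the stored weights with the original and new weights combined
all.equal(unname(model_updated$weights), c(dat_1$obs_w, dat_2$obs_w))
```

    Checking model_updated$weights against the combined weights, as above, is a quick way to confirm that the new observations were picked up with the intended weights.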

    This is with:

    ─ Session info ───────────────────────────────────────────────────────────────
     setting  value
     version  R version 4.3.3 (2024-02-29)
     os       Ubuntu 20.04.6 LTS
     system   x86_64, linux-gnu
     ui       X11
     language en_GB:en
     collate  en_GB.UTF-8
     ctype    en_GB.UTF-8
     tz       Europe/Copenhagen
     date     2024-04-18
     pandoc   2.5 @ /usr/bin/pandoc
    
    ─ Packages ───────────────────────────────────────────────────────────────────
     package     * version date (UTC) lib source
     cli           3.6.2   2023-12-11 [1] RSPM (R 4.3.2)
     data.table  * 1.15.4  2024-03-30 [1] RSPM (R 4.3.3)
     lattice       0.22-6  2024-03-20 [1] RSPM (R 4.3.3)
     Matrix        1.6-5   2024-01-11 [1] RSPM (R 4.3.2)
     mgcv        * 1.9-1   2023-12-21 [1] CRAN (R 4.3.3)
     nlme        * 3.1-164 2023-11-27 [1] RSPM (R 4.3.2)
     sessioninfo   1.2.2   2021-12-06 [1] RSPM (R 4.3.0)
    
     [1] /home/au690221/R/x86_64-pc-linux-gnu-library/4.3
     [2] /usr/local/lib/R/site-library
     [3] /usr/lib/R/site-library
     [4] /usr/lib/R/library
    
    ──────────────────────────────────────────────────────────────────────────────