I am re-running a script I wrote about a year ago, and I am getting an error when I reach a specific part that uses predictions() from the marginaleffects package. See below for a minimal reproducible example that uses the flights data from the nycflights13 package.
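require(nycflights13)
library(dplyr)            # %>%, filter(), mutate(), arrange(), select(), as_tibble()
library(marginaleffects)  # predictions(), datagrid()
bA <- lm(as.numeric(arr_delay) ~ hour,
         data = flights %>%
           filter(month == 4) %>%
           filter(distance < 900) %>%
           mutate(hour = as.factor(hour)))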
bB <- predictions(bA, newdata = datagrid(hour = unique), conf_level = 0.99857) %>%
  as_tibble() %>%
  arrange(hour) %>%
  mutate(estimate =
           paste0(sprintf(estimate, fmt = '%#.1f'), " (",
                  sprintf(conf.low, fmt = '%#.1f'), "; ",
                  sprintf(conf.high, fmt = '%#.1f'), ")")) %>%
  select(hour, estimate)
Error in as.character(at[[n]]) : cannot coerce type 'closure' to vector of type 'character'
The first chunk above runs just fine, but the second chunk fails with the "cannot coerce type 'closure'..." message.
This code used to run just fine, so something must have changed in the package. These are my system/version specs, in case they are helpful:
> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1
> packageVersion('marginaleffects')
[1] ‘0.5.0’
I'd appreciate any thoughts people might have!
As noted by a commenter, your code works fine with the latest version of marginaleffects (0.24.0). You should not use a nearly two-year-old version like 0.5.0, because many bugs have been fixed since that release.
The error occurs because datagrid() in version 0.5.0 did not support supplying an unevaluated function like unique. You had to give the actual values, like datagrid(hour=unique(flights$hour)). The code you posted never worked in version 0.5.0. You must have previously run it with a later version.
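To see why a bare function triggers this particular message, here is a minimal base R sketch. It does not call marginaleffects at all, and hours is a made-up vector standing in for the hour column. It only mimics the as.character() coercion named in the error message; I am not claiming this is exactly what datagrid() does internally.

```r
# Base R only: illustrates the coercion failure named in the error message.
# `hours` is a made-up stand-in for the hour column (an assumption for this sketch).
hours <- factor(c(5, 6, 6, 7))

# Passing the function object itself: a closure cannot be coerced to character,
# so as.character() raises an error. tryCatch() captures its message as a string.
msg <- tryCatch(
  as.character(unique),
  error = function(e) conditionMessage(e)
)
print(msg)

# Calling the function first yields plain values, which coerce without trouble.
vals <- as.character(unique(hours))
print(vals)
#> [1] "5" "6" "7"
```

The same distinction applies to the datagrid() call: unique is a function object, while unique(flights$hour) is a vector of values.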
AFAICT, the confidence intervals are exactly the same with both versions of marginaleffects. Make sure you use the same conf_level in all versions of your code.
Moreover, the confidence intervals are extremely similar to those produced by the base R predict() function, with differences possibly explainable by numerical precision, and not large enough to be practically meaningful.
require(nycflights13)
library(marginaleffects)
library(dplyr)
bA <- lm(as.numeric(arr_delay) ~ hour,
         data = flights %>%
           filter(month == 4) %>%
           filter(distance < 900) %>%
           mutate(hour = as.factor(hour)))
nd <- datagrid(hour = unique, model = bA)
p1 <- predictions(bA, newdata = nd, conf_level = 0.99857)
p2 <- predict(bA, newdata = nd, interval = "confidence", level = 0.99857)
all.equal(p1[, "conf.low"], p2[, "lwr"], check.attributes = FALSE)
#> [1] "Mean relative difference: 0.0001351104"
all.equal(p1[, "conf.high"], p2[, "upr"], check.attributes = FALSE)
#> [1] "Mean relative difference: 6.768819e-05"