I cannot add points to the diagram of GAM results displayed with the plot_smooths()
function of tidymv package.
I would like to add points to the diagram of GAM results displayed with the plot_smooths()
function.
For example;
library(mgcv)
library(tidymv)
library(gratia)
x<-c(0.17,0.19,0.18,0.33,0.24,0.2,0.2,0.19,0.36,0.29,0.24,0.24,0.21,0.23,0.31,0.36,0.33,0.24,0.31)
y<-c(8475,86,209,8230,8372,8475,8475,2867,149,8410,8161,8474,8451,282,7409,682,504,66,315)
dat<-data.frame(x,y)
gam <- gam(y ~ s(x), data=dat)
plot_smooths(gam, x)
I wish to add points to the diagram displayed in this state.
I know that the plot.gam()
and daw()
functions of the gratia package display points when residuals = TRUE.
plot.gam(gam,residuals = TRUE)
draw(gam, residuals = TRUE)
However, in this case, the y-axis values differed from the actual data values.
I wanted to plot the values of the data usede.
Please let me know if there is a way to plot the data values used in the plot.gam()
function or the draw()
function.
Thank you in advance.
The easiest way would be to predict from your model to get a nice smooth plot of the estimated smooth function and then add the observed data on top. For example, with gratia::fitted_values()
and a little bit of ggplot()
we have:
library("mgcv")
library("gratia")
x <- c(0.17, 0.19, 0.18, 0.33, 0.24, 0.2, 0.2, 0.19, 0.36, 0.29, 0.24, 0.24,
0.21, 0.23, 0.31, 0.36, 0.33, 0.24, 0.31)
y <- c(8475, 86, 209, 8230, 8372, 8475, 8475, 2867, 149, 8410, 8161, 8474,
8451, 282, 7409, 682, 504, 66, 315)
dat <- data.frame(x, y)
m <- gam(y ~ s(x), data = dat)
# data to predict at, evenly in x, 100 values
ds <- data_slice(m, x = evenly(x, n = 100))
# compute fitted values
fv <- fitted_values(m, data = ds)
fv
looks like this:
> fv
# A tibble: 100 × 6
.row x .fitted .se .lower_ci .upper_ci
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0.17 5349. 1770. 1880. 8818.
2 2 0.172 5348. 1722. 1973. 8723.
3 3 0.174 5347. 1675. 2064. 8631.
4 4 0.176 5346. 1630. 2150. 8542.
5 5 0.178 5345. 1587. 2234. 8456.
6 6 0.180 5344. 1546. 2314. 8375.
7 7 0.182 5344. 1507. 2391. 8297.
8 8 0.183 5344. 1469. 2464. 8223.
9 9 0.185 5343. 1433. 2534. 8152.
10 10 0.187 5343. 1399. 2601. 8085.
# ℹ 90 more rows
# ℹ Use `print(n = ...)` to see more rows
it is a tibble (fancy data frame) that is suited to plotting with ggplot()
and manipulation with dplyr etc.
Now for the plot
library("ggplot2")
fv |>
ggplot(aes(x = x, y = .fitted)) + # setup
geom_ribbon( # add credible interval
aes(ymax = .upper_ci, ymin = .lower_ci),
fill = "black",
alpha = 0.2
) +
geom_line() + # add estimated smooth
geom_point( # add your data
data = dat,
aes(x = x, y = y)
)
which produces
There are other ways. You could evaluate the smooth (adding on the intercept and a credible interval, and use the draw()
method to plot it and then add on the data as above
m |>
smooth_estimates(select = "s(x)") |> # evaluate smooth
add_constant(model_constant(m)) |> # add on intercept
add_confint() |> # add on credible interval
draw() + # base plot
geom_point( # add your data
data = dat,
aes(x = x, y = y)
)
which gets you a bit more info automagically added to the plot if you like that sort of thing
With {gratia} I've focused on two ends of the spectrum
draw()
methods for the basic plots of smooths, which have some functionality for changing the y-axis scale (via arguments fun
and constant
in gratia:::draw.smooth_estimates()
for example.)The second example I showed is a bit of a hybrid; I used the draw()
method for smooth_estimates()
, which adds a few things, but you could almost as easily just plotted the object returned smooth_estimates()
with:
m |>
smooth_estimates(select = "s(x)") |> # evaluate smooth
add_constant(model_constant(m)) |> # add on intercept
add_confint() |> # add on credible interval
ggplot(aes(x = x, y = .fitted)) + # setup
geom_ribbon( # add credible interval
aes(ymax = .upper_ci, ymin = .lower_ci),
fill = "black",
alpha = 0.2
) +
geom_line() + # add estimated smooth
geom_point( # add your data
data = dat,
aes(x = x, y = y)
)
Once I have got the rest of the package where I want it to be, I do have plans to add more canned plotting functionality and also some ggplot geoms. But I felt it important to first return objects that make it easy to do what the user wants and then build out this other functionality when time permits.