I would like to plot a threshold model with smooth 95% confidence interval lines between line segments. You would think this would be on the simple side but I have not been able to find an answer!
My threshold/breakpoints are known, it would be great if there were a way to visualize this data. I have tried the segmented package which produces the following plot:
The plot shows a threshold model with a breakpoint at 5.4. However, the confidence intervals are not smooth between regression lines.
If anyone knows of any way to produce smooth (i.e. without the jump between line segments) CI lines between segmented regression lines (ideally in ggplot) that would be amazing. Thank you so much.
I have included sample data and the code I have tried below:
x <- c(2.26, 1.95, 1.59, 1.81, 2.01, 1.63, 1.62, 1.19, 1.41, 1.35, 1.32, 1.52, 1.10, 1.12, 1.11, 1.14, 1.23, 1.05, 0.95, 1.30, 0.79,
0.81, 1.15, 1.10, 1.29, 0.97, 1.05, 1.05, 0.84, 0.64, 0.80, 0.81, 0.61, 0.71, 0.75, 0.30, 0.30, 0.49, 1.13, 0.55, 0.77, 0.51,
0.67, 0.43, 1.11, 0.29, 0.36, 0.57, 0.02, 0.22, 3.18, 3.79, 2.49, 2.44, 2.12, 2.45, 3.22, 3.44, 3.86, 3.53, 3.13)
y <- c(22.37, 18.93, 16.99, 15.65, 14.62, 13.79, 13.09, 12.49, 11.95, 11.48, 11.05, 10.66, 10.30, 9.96, 9.65, 9.35, 9.07, 8.81,
8.56, 8.32, 8.09, 7.87, 7.65, 7.45, 7.25, 7.05, 6.86, 6.68, 6.50, 6.32, 6.15, 5.97, 5.80, 5.63, 5.47, 5.30,
5.13, 4.96, 4.80, 4.63, 4.45, 4.28, 4.09, 3.90, 3.71, 3.50, 3.27, 3.01, 2.70, 2.28, 22.37, 16.99, 11.05, 8.81,
8.56, 8.32, 7.25, 7.05, 6.50, 6.15, 5.63)
lin.mod <- lm(y ~ x)
segmented.mod <- segmented(lin.mod, seg.Z = ~x, psi=2)
plot(x, y)
plot(segmented.mod, add=TRUE, conf.level = 0.95)
which produces the following plot (and associated jumps in 95% confidence intervals):
Background: The non-smoothness in existing change point packages are due to the fact that frequentist packages operate with a fixed change point value. But as with all inferred parameters, this is wrong because there is indeed uncertainty concerning the location of the change.
Solution: AFAIK, only Bayesian methods can quantify that and the mcp
package fills this space.
library(mcp)
model = list(
y ~ 1 + x, # Segment 1: Intercept and slope
~ 0 + x # Segment 2: Joined slope (no intercept change)
)
fit = mcp(model, data = data.frame(x, y))
Default plot (plot.mcpfit()
returns a ggplot
object):
plot(fit) + ggtitle("Default plot")
Each line represents a possible model that generated the data. The posterior for the change point is shown as a blue density. You can add a credible interval on top using plot(fit, q_fit = TRUE)
or plot it alone:
plot(fit, lines = 0, q_fit = c(0.025, 0.975), cp_dens = FALSE) + ggtitle("Credible interval only")
If your change point is indeed known and if you want to model different residual scales for each segment (i.e., quasi-emulate segmented
), you can do:
model2 = list(
y ~ 1 + x,
~ 0 + x + sigma(1) # Add intercept change in residual scale
)
fit = mcp(model2, df, prior = list(cp_1 = 1.9)) # Note: prior is a fixed value - not a distribution.
plot(fit, q_fit = TRUE, cp_dens = FALSE)
Notice that the CI does not "jump" around the change point as in segmented
. I believe that this is the correct behavior. Disclosure: I am the author of mcp
.