I'm trying to fit cubic, natural, and smoothing splines to the Auto dataset from the ISLR package. I'm getting warning and error messages that make me think something is wrong with my data or with a matrix I created.
What is really confusing is that even this basic command throws an error:
natural.splines.fit <- lm(horsepower ~ ns(mpg, knots = c(25, 50, 75)), data = Auto)
Error in qr.default(t(const)) : NA/NaN/Inf in foreign function call (arg 1)
There are additional errors/warnings in my code, but here is the thing: I essentially copied the code from somewhere, ran it successfully on the Carseats dataset, and then changed the variables to match the Auto dataset. That is why I'm confused: I don't understand why I get errors for the Auto dataset but not for Carseats. Does anyone have some insight?
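The problem is that some of your knots lie outside the range of the predictor variable: mpg in Auto only runs from 9 to 46.6, so the knots at 50 and 75 fall beyond the data, and ns() expects interior knots to sit inside its boundary knots (by default the range of the data). The same knots most likely worked for Carseats because the predictor you used there covers 25 to 75. Here is a basic example that works (I just picked knots that are within the range of mpg):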
library(splines)  # provides ns()
x <- ISLR::Auto
natural.splines.fit <- lm(horsepower ~ ns(mpg, knots = c(10,20,30,40)), data = x)
summary(natural.splines.fit)
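If you want to check this yourself before fitting, a quick sketch along these lines should show which knots are the problem. The helper name `knots_outside_range` is my own (it is not part of splines or ISLR), and the `requireNamespace()` guard is only there so the snippet still runs when ISLR is not installed.

```r
# Hypothetical helper (not from any package): returns the knots that
# fall outside the observed range of the predictor x.
knots_outside_range <- function(x, knots) {
  r <- range(x, na.rm = TRUE)
  knots[knots < r[1] | knots > r[2]]
}

# Apply it to the knots from the question, if ISLR is available.
if (requireNamespace("ISLR", quietly = TRUE)) {
  auto <- ISLR::Auto
  print(range(auto$mpg))                                # 9.0 46.6
  print(knots_outside_range(auto$mpg, c(25, 50, 75)))   # 50 75
}
```

An empty result means every knot sits inside the data range, which is what ns() needs with its default boundary knots.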
I believe you are trying to place the knots at the 25th, 50th, and 75th percentiles, so I recommend first computing the values at those quantiles and then fitting the model. Here is how I did it:
target_quantiles <- unname(quantile(x$mpg, probs = c(0.25,0.5,0.75)))
natural.splines.fit2 <- lm(horsepower ~ ns(mpg, knots = target_quantiles), data = x)
summary(natural.splines.fit2)
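As an alternative sketch, you can let ns() pick the knots for you: when you pass df instead of knots, it places df - 1 interior knots at evenly spaced quantiles of the predictor (with the default intercept = FALSE), so df = 4 should give the three quartiles. The helper `ns_interior_knots` and the name `natural.splines.fit3` are mine, purely for illustration, and the ISLR guard is only there so the snippet runs without the package.

```r
library(splines)

# Hypothetical helper: the interior knots that ns() selects for a given df.
ns_interior_knots <- function(x, df) {
  attr(ns(x, df = df), "knots")
}

if (requireNamespace("ISLR", quietly = TRUE)) {
  x <- ISLR::Auto
  # Same kind of model as above, but with knots chosen by ns() itself
  natural.splines.fit3 <- lm(horsepower ~ ns(mpg, df = 4), data = x)
  # These should match quantile(x$mpg, probs = c(0.25, 0.5, 0.75))
  print(ns_interior_knots(x$mpg, df = 4))
}
```

This avoids hard-coding knot values at all, so the same line keeps working if you switch to a predictor with a different range.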