I'm trying to use the ns()
function from the splines package with a poisson GLM I am using to test for significance of particulate matter concentration (pm.lag0
) on health outcomes (Freq
):
> gfit4 = glm(Freq ~ pm.lag0 + ns(date, df=2), family = poisson(),
data = dt, offset = log(pop))
I get these errors back:
Error in splineDesign(knots, x, ord, derivs, outer.ok = outer.ok) :
must have at least 'ord' knots
In addition: Warning message:
In sort(as.numeric(knots)) : NAs introduced by coercion
Is that not a valid use of ns()
? Can someone help me decode this error message? The splines documentation that R provides doesn't seem to match this error (?ns
).
I can't see a reason why, in principle, it is not possible to use ns()
in glm()
. To see why, study what ns()
does in a formula. From ?ns
> model.frame(weight ~ ns(height, df = 5), data = women)
weight ns(height, df = 5).1 ns(height, df = 5).2 ns(height, df = 5).3 ns(height, df = 5).4 ns(height, df = 5).5
1 115 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
2 117 7.592323e-03 0.000000e+00 -8.670223e-02 2.601067e-01 -1.734045e-01
3 120 6.073858e-02 0.000000e+00 -1.503044e-01 4.509132e-01 -3.006088e-01
4 123 2.047498e-01 6.073858e-05 -1.677834e-01 5.033503e-01 -3.355669e-01
5 126 4.334305e-01 1.311953e-02 -1.324404e-01 3.973211e-01 -2.648807e-01
6 129 6.256681e-01 8.084305e-02 -7.399720e-02 2.219916e-01 -1.479944e-01
7 132 6.477162e-01 2.468416e-01 -2.616007e-02 7.993794e-02 -5.329196e-02
8 135 4.791667e-01 4.791667e-01 1.406302e-02 2.031093e-02 -1.354062e-02
9 139 2.468416e-01 6.477162e-01 9.733619e-02 2.286023e-02 -1.524015e-02
10 142 8.084305e-02 6.256681e-01 2.707683e-01 6.324188e-02 -4.052131e-02
11 146 1.311953e-02 4.334305e-01 4.805984e-01 1.252603e-01 -5.240872e-02
12 150 6.073858e-05 2.047498e-01 5.954160e-01 1.989926e-01 7.809246e-04
13 154 0.000000e+00 6.073858e-02 5.009718e-01 2.755102e-01 1.627794e-01
14 159 0.000000e+00 7.592323e-03 2.246113e-01 3.520408e-01 4.157556e-01
15 164 0.000000e+00 0.000000e+00 -1.428571e-01 4.285714e-01 7.142857e-01
Which shows that it provides the B-spline basis for the natural spline of the height
variable. Nothing special here.
Hence I suspect your date
variable is not numeric or not something that R could work with by coercing it to be numeric without introducing NA
- see the warning message. Without a reproducible example and information on your data it is impossible to tell however!
In addition, you might want to look at the gam()
function in package mgcv which as a Recommended package is distributed with R. It is designed to fit semi-parametric models in the manner you describe and can include parametric terms as well as smooths/splines of other terms. The package is fairly comprehensive and can fit a large number of types of spline. See it's manual.