
How to prevent extrapolation using na.spline()


I'm having trouble with the na.spline() function in the zoo package. Although the documentation explicitly states that this is an interpolation function, the behaviour I'm getting includes extrapolation.

The following code reproduces the problem:

library(zoo)
vector <- c(NA,NA,NA,NA,NA,NA,5,NA,7,8,NA,NA)
na.spline(vector)

I expected the output to be:

NA NA NA NA NA NA  5  6  7  8  NA NA

That is, the internal NA would be interpolated and the leading and trailing NAs left in place. Instead, I get:

-1  0  1  2  3  4  5  6  7  8  9 10

According to the documentation, this shouldn't happen. Is there some way to avoid extrapolation?

I recognise that in my example, I could use linear interpolation, but this is a MWE. Although I'm not necessarily wed to the na.spline() function, I need some way to interpolate using cubic splines.


Solution

  • This behavior appears to be coming from the stats::spline function, e.g.,

    spline(seq_along(vector), vector, xout=seq_along(vector))$y
    # [1] -1  0  1  2  3  4  5  6  7  8  9 10
    

    Here is a workaround that relies on the fact that na.approx strictly interpolates, so it leaves the leading and trailing NAs in place.

    replace(na.spline(vector), is.na(na.approx(vector, na.rm=FALSE)), NA)
    # [1] NA NA NA NA NA NA  5  6  7  8 NA NA
    

    Edit

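    As @G.Grothendieck suggests in the comments below, another, likely faster, way is to add zero times the na.approx result. Since 0*NA is NA, the extrapolated positions become NA while the interpolated values are unchanged: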

    na.spline(vector) + 0*na.approx(vector, na.rm = FALSE)
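
If you would rather not depend on zoo for the masking step, the same idea can be sketched in base R: fit the spline with stats::spline, then blank out every position that lies outside the range of the observed values. The helper name interp_spline_interior is hypothetical (it is not part of zoo or stats), and the sketch assumes a plain numeric vector indexed by position.

```r
# Hypothetical helper, not part of zoo or stats: cubic-spline interpolation
# of interior NAs only. Leading and trailing NAs are left as NA.
interp_spline_interior <- function(x) {
  ok <- which(!is.na(x))            # positions of the observed values
  if (length(ok) < 2) return(x)     # too few points to interpolate between
  idx <- seq_along(x)
  # Fit on the observed points only and evaluate at every position
  out <- stats::spline(idx[ok], x[ok], xout = idx)$y
  # Mask everything outside the observed range (the extrapolated part)
  out[idx < min(ok) | idx > max(ok)] <- NA
  out
}

vector <- c(NA, NA, NA, NA, NA, NA, 5, NA, 7, 8, NA, NA)
result <- interp_spline_interior(vector)
print(result)
```

The mask only compares positions against the first and last observed index, so it does not need a second interpolation pass the way the na.approx-based versions do.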