Search code examples
roptimizationloess

R function to calculate SSE for loess span optimisation - can't find error in code


The code comes from http://r-statistics.co/Loess-Regression-With-R.html, and it is exactly what I would need, if I could only make it work. Data:

data(economics, package="ggplot2") 
economics$index <- 1:nrow(economics)
economics <- economics[1:80, ] 

Function to calculate SSE:

calcSSE <- function(x){
  loessMod <- try(loess(uempmed ~ index, data=economics, span=x), silent=T)
  res <- try(loessMod$residuals, silent=T)
  if(class(res)!="try-error"){
    if((sum(res, na.rm=T) > 0)){
      sse <- sum(res^2)  
    }
  }else{
    sse <- 99999
  }
  return(sse)
}

Finding out the best 'span' value for loess:

optim(par=c(0.5), calcSSE, method="SANN")

The problem is that this should yield a span of 0.0543 and an achievable minimum SSE of 3.85e-28 for a function count of 10000 (this can be checked at the link at the beginning of this post), but if I run it, it gives 0.5, SSE 99999 and the function count is only 2, so it obviously isn't working.

I am running R 4.1.2 under Win10 from RStudio. The only thing I can think of is that something major in R has been changed since this code was written. Any help in fixing it would be greatly appreciated.


Solution

  • I think the problem is here: if((sum(res, na.rm=T) > 0)){ - this unnecessarily triggers a move to sse <- 99999 when the sum of the residuals could quite reasonably be below zero. If you remove that condition, then you get a more reasonable answer. The try() function should catch any wonky stuff going on in the results.

    data(economics, package="ggplot2") 
    economics$index <- 1:nrow(economics)
    economics <- economics[1:80, ] 
    
    
    calcSSE <- function(x){
      loessMod <- try(loess(uempmed ~ index, data=economics, span=x), silent=T)
      res <- try(loessMod$residuals, silent=T)
      if(!inherits(res, "try-error")){
          sse <- sum(res^2)  
      }else{
        sse <- 99999
      }
      return(sse)
    }
    
    optimize(calcSSE, c(0.01,1))
    #> $minimum
    #> [1] 0.04414211
    #> 
    #> $objective
    #> [1] 7.888609e-31
    

    Created on 2022-01-30 by the reprex package (v2.0.1)

    Note that I used optimize instead of optim because that's suggested if there is only one parameter being optimized. I also removed the warnings from the output. One other note, I switched class(res)!="try-error" to !inherits(res, "try-error") as this is encouraged because classes can often have more than one value.