
R - loess curve not fitted correctly through points


I am trying to filter out points that are too close to or below a loess curve:

The result looks like this: [image: Scatterplot with loess curve]

Obviously not the desired outcome.

If, however, I use the scatter.smooth function, I get a correct-looking curve: [image: Scatterplot with scatter.smooth curve]

How can I correctly fit the loess curve through my data?


Solution

  • First, we should inspect what the predict function returns:

    head(predict(afit))
    [1] 0.8548271 0.8797704 0.8584954 0.8031563 0.9012096 0.8955874
    

    It's a vector, so when we pass it to lines, R says, "okay, you didn't specify an x value, so I'll just use the index for the x values" (try plot(2:10) to see what I mean).

    So, what we need to do is pass lines a two-column matrix of x and y values instead, sorted by x so that the curve is drawn from left to right:

    cbind(sort(means), predict(afit, newdata = sort(means)))

    should do the trick. Your function can be written as:

    FilterByVariance <- function(dat, threshold = 0.90, span = 0.75) {
      # Per-row mean and standard deviation
      means <- apply(dat, 1, mean)
      sds <- apply(dat, 1, sd)
      cv <- sqrt(sds / means)

      afit <- loess(cv ~ means, span = span)
      resids <- afit$residuals
      # good <- which(resids >= quantile(resids, probs = threshold))
      # Points above the curve have a residual > 0
      good <- which(resids > 0)

      # Plot the data, the fitted curve (x values sorted so the line runs
      # left to right), and the points above the curve in red
      plot(cv ~ means)
      lines(cbind(sort(means), predict(afit, newdata = sort(means))),
            col = "blue", lwd = 3)
      points(means[good], cv[good], col = "red", pch = 19)
    }
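
    To see the difference outside the function, here is a minimal sketch on simulated data. The variables x, y, fit, ord, and curve_xy are made up for illustration and are not from the question. Instead of calling predict a second time with newdata, it reorders the fitted values by x, which should give an equivalent curve:

```r
# Simulated data; the values are illustrative only, not from the question.
set.seed(1)
x <- runif(200, min = 1, max = 10)
y <- 1 / sqrt(x) + rnorm(200, sd = 0.05)

fit <- loess(y ~ x, span = 0.75)

# predict() without newdata returns the fitted values as a plain numeric
# vector, in the original (unsorted) order of x.
fitted_unsorted <- predict(fit)

# Reorder both coordinates by x and bind them into a two-column matrix.
ord <- order(x)
curve_xy <- cbind(x = x[ord], y = fitted_unsorted[ord])

plot(y ~ x)
lines(fitted_unsorted, col = "grey")    # x is taken to be the index 1, 2, ..., 200
lines(curve_xy, col = "blue", lwd = 3)  # fitted curve against the real x values

# Points above the curve are those with a positive residual.
good <- which(residuals(fit) > 0)
points(x[good], y[good], col = "red", pch = 19)
```

    The grey line is what happens when only the vector is passed to lines. The blue line uses explicit, sorted x values and follows the point cloud.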