Search code examples
rsmoothing

How should I smooth the outliers on a 3D surface?


I have taken a 3D surface from package akima.

library(akima)
data(akima)
matrix_1 <- akima::interp(x = akima$x, y = akima$y, z = akima$z)
persp(matrix_1$x, matrix_1$y, matrix_1$z)

3D Surface

To this surface, I have added some outlier data points.

# Add outliers
akima$x = c(akima$x, 11, 12, 13)
akima$y = c(akima$y, 11, 12, 13)
akima$z = c(akima$z, 70, 80, 75)
matrix_2 <- akima::interp(x = akima$x, y = akima$y, z = akima$z)
persp(x = matrix_2$x, y = matrix_2$y, z = matrix_2$z)

The new surface looks like - 3D Surface with outliers

Now, I want to smooth these outliers to the original surface without disturbing the original surface but I could not accomplish that. Here is what I have tried and I will appreciate it if someone can show me how to do this.

# Trying to smooth and remove outliers
library(mgcv)
mod <- gam(z ~ te(x) + te(y) + ti(x, y), data = akima)
akima$new = predict(mod, akima)

matrix_3 <- akima::interp(x = akima$x, y = akima$y, z = akima$new)
persp(matrix_3$x, matrix_3$y, matrix_3$z)

Here is the surface after smoothing with gam, which is not the expected output. The expected output is the first graph with the new values of outliers.

Incorrect 3D surface disturbed by outliers


Solution

  • This is more of a statistical question than a computing question, but one approach would be to look at the size of the residuals from your gam() fit, and only modify the z coordinate if the residual is particularly large.

    For example:

    library(akima)
    data(akima)
    
    akima$x <- c(akima$x, 11, 12, 13)
    akima$y <- c(akima$y, 11, 12, 13)
    akima$z <- c(akima$z, 70, 80, 75)
    
    # Trying to smooth and remove outliers
    library(mgcv)
    #> Warning: package 'mgcv' was built under R version 4.1.2
    #> Loading required package: nlme
    #> Warning: package 'nlme' was built under R version 4.1.2
    #> This is mgcv 1.8-39. For overview type 'help("mgcv-package")'.
    mod <- gam(z ~ te(x) + te(y) + ti(x, y), data = akima)
    akima$pred <- predict(mod, akima)
    akima$resid <- akima$z - akima$pred
    hist(akima$resid)
    

    akima$new <- ifelse(akima$resid < 25, akima$z, akima$pred)
    matrix_3 <- akima::interp(x = akima$x, y = akima$y, z = akima$new)
    persp(matrix_3$x, matrix_3$y, matrix_3$z)
    

    Created on 2022-03-29 by the reprex package (v2.0.1)

    I just chose the cutoff of 25 by eye; if you want to use a more automatic method, you'll need to ask about that on stats.stackexchange.com . You could also iterate a bit, to get better predictions without including the outliers. Again, that's a stats question.