Average neighbours inside a vector


My data:

data <- c(1,5,11,15,24,31,32,65)

There are 2 neighbours: 31 and 32. I wish to remove them and keep only their mean value (i.e. 31.5), so that data becomes:

data <- c(1,5,11,15,24,31.5,65)

It seems simple, but I wish to do it automatically, and sometimes with vectors containing several groups of neighbours. For instance:

data_2 <- c(1,5,11,15,24,31,32,65,99,100,101,140)

Solution

  • Here is another idea: build a group id via cumsum(c(TRUE, diff(a) > 1)), where 1 is the gap threshold. Consecutive elements that differ by no more than the threshold get the same id, i.e.

    #our group variable
    grp <- cumsum(c(TRUE, diff(a) > 1))
    
    #keep only groups with length 1 (i.e. with no neighbor)
    i1 <- a[ave(a, grp, FUN = length) == 1]
    
    #Find the mean of the groups with more than 1 element (single-element groups give NaN)
    i2 <- unname(tapply(a, grp, function(i)mean(i[length(i) > 1])))
    
    #Concatenate the above 2 (dropping the NaN entries of i2, which is.na() also catches) to get the final result
    c(i1, i2[!is.na(i2)])
    #[1]  1.0  5.0 11.0 15.0 24.0 65.0 31.5
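
    To make the grouping step easier to follow, here is a small sketch that prints the intermediate values for the example vector. The names d and starts are only illustrative and are not used in the answer's code.

    ```r
    # Same example vector as in the DATA section
    a <- c(1, 5, 11, 15, 24, 31, 32, 65)

    # Differences between consecutive elements
    d <- diff(a)
    # d is 4 6 4 9 7 1 33

    # TRUE marks the start of a new group: the first element always starts one,
    # and so does any element that is more than 1 away from its predecessor
    starts <- c(TRUE, d > 1)

    # The running count of group starts gives every element its group id
    grp <- cumsum(starts)
    # grp is 1 2 3 4 5 6 6 7, so 31 and 32 share group 6

    # Group sizes: group 6 has two members, every other group has one
    table(grp)
    ```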
    

    You can also wrap it in a function. I left the gap as a parameter so you can adjust it:

    get_vec <- function(x, gap) {
        grp <- cumsum(c(TRUE, diff(x) > gap))
        i1 <- x[ave(x, grp, FUN = length) == 1]
        i2 <- unname(tapply(x, grp, function(i) mean(i[length(i) > 1])))
        return(c(i1, i2[!is.na(i2)]))
    }
    
    get_vec(a, 1)
    #[1]  1.0  5.0 11.0 15.0 24.0 65.0 31.5
    
    get_vec(a_2, 1)
    #[1]   1.0   5.0  11.0  15.0  24.0  65.0 140.0  31.5 100.0
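
    Note that the results above place the averaged values at the end rather than at their original position. If the order from the question matters, one possible variant is sketched below. It relies on the fact that the mean of a single-element group is the element itself, so one tapply() call covers both cases. The name avg_neighbours is hypothetical, and the sketch assumes x is numeric and sorted in increasing order.

    ```r
    # Hypothetical helper, not part of the code above: average runs of
    # neighbours while keeping the original order of the vector.
    # Assumes x is numeric and sorted in increasing order.
    avg_neighbours <- function(x, gap = 1) {
        # Elements no more than `gap` away from their predecessor share a group
        grp <- cumsum(c(TRUE, diff(x) > gap))
        # tapply() returns one mean per group, in group order;
        # as.vector() drops the names and the dim attribute
        as.vector(tapply(x, grp, mean))
    }

    a <- c(1, 5, 11, 15, 24, 31, 32, 65)
    a_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

    avg_neighbours(a)
    #[1]  1.0  5.0 11.0 15.0 24.0 31.5 65.0

    avg_neighbours(a_2)
    #[1]   1.0   5.0  11.0  15.0  24.0  31.5  65.0 100.0 140.0

    # A wider gap also merges 1 with 5 and 11 with 15
    avg_neighbours(a_2, gap = 5)
    #[1]   3.0  13.0  24.0  31.5  65.0 100.0 140.0
    ```

    Because groups are chained through consecutive differences, a long run such as 1, 2, 3, 4 collapses into a single mean even though its first and last elements are more than gap apart.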
    

    DATA:

    a <- c(1,5,11,15,24,31,32,65)
    a_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)