I wish to replace every 0 that belongs to a run of 4 or more consecutive 0 values with NaN. Your help would be appreciated.
I have the following data:
A B C
1 2 0
0 8 0
0 0 0
0 0 0
0 0 8
0 4 9
What I am looking to get :
A B C
1 2 Na
Na 8 Na
Na 0 Na
Na 0 Na
Na 0 8
Na 4 9
We can use rle to do this. Loop over the columns with lapply, apply rle to the logical vector (x == 0) to get the runs of adjacent identical elements ('values') and their 'lengths', set to FALSE every 'values' entry that is not a run of zeros with 'lengths' greater than or equal to 4, expand the result back to a full-length logical vector with inverse.rle, and use that index to replace the matching values of 'x' with NA:
df1[] <- lapply(df1, function(x) {
    i1 <- inverse.rle(within.list(rle(x == 0),
             values[!(values & lengths >= 4)] <- FALSE))
    replace(x, i1, NA)
})
df1
# A B C
#1 1 2 NA
#2 NA 8 NA
#3 NA 0 NA
#4 NA 0 NA
#5 NA 0 8
#6 NA 4 9
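To make the intermediate steps easier to follow, here is a minimal sketch that walks through the same rle logic on a single column (column A from the question). The variable names r, i1 and res_a are only illustrative, and the direct assignment to r$values is an assumed simplification that should behave like the within.list call above.

```r
# Column A from the question
x <- c(1L, 0L, 0L, 0L, 0L, 0L)

# Runs of the logical vector: one non-zero element, then five zeros
r <- rle(x == 0)
r$values    # FALSE  TRUE
r$lengths   # 1 5

# Keep a run flagged only if it is a run of zeros AND at least 4 long
r$values <- r$values & r$lengths >= 4

# Expand back to one flag per element of x
i1 <- inverse.rle(r)
i1          # FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

# Replace the flagged positions with NA
res_a <- replace(x, i1, NA)
res_a       # 1 NA NA NA NA NA
```

For column B the only run of zeros has length 3, so every flag ends up FALSE and the column is returned unchanged.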
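Or an option with rleid from data.table, used inside dplyr's mutate/across (this prints the result; assign it back to df1 to keep it):
library(data.table)
library(dplyr)
df1 %>%
    mutate(across(everything(), ~ replace(., . == 0 &
        ave(., rleid(.), FUN = length) >= 4, NA)))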
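As a rough illustration of what the rleid/ave combination computes, the sketch below applies it to a single column (column C from the question). It assumes data.table is installed; the names run_id, run_len and res_c are only illustrative.

```r
library(data.table)

# Column C from the question
x <- c(0L, 0L, 0L, 0L, 8L, 9L)

# rleid gives each run of identical adjacent values its own id
run_id <- rleid(x)
run_id      # 1 1 1 1 2 3

# ave with FUN = length repeats each run's length for every element in it
run_len <- ave(x, run_id, FUN = length)
run_len     # 4 4 4 4 1 1

# A value is replaced only if it is 0 and sits in a run of length >= 4
res_c <- replace(x, x == 0 & run_len >= 4, NA)
res_c       # NA NA NA NA  8  9
```

Because the run length is attached to every element, the condition can be written as a plain element-wise comparison, which is what lets it sit inside across().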
# Example data used above
df1 <- structure(list(A = c(1L, 0L, 0L, 0L, 0L, 0L),
                      B = c(2L, 8L, 0L, 0L, 0L, 4L),
                      C = c(0L, 0L, 0L, 0L, 8L, 9L)),
                 class = "data.frame", row.names = c(NA, -6L))