I have a nested list (a list of lists) containing only integers. Some of the nested lists have NA values scattered at random from a certain position onward. For each nested list, I need to remove the element in which the first NA appears, together with every element after it.
For example, in my sample data below, L.miss is a list of 6 nested lists, and L.miss[[2]] is a list of nine integer vectors of varying length. The first NA appears at position L.miss[[2]][[4]][3], so any(is.na(L.miss[[2]][[4]])) returns TRUE. In my desired output, positions L.miss[[2]][4:9] need to be removed. The list L.want is the desired outcome.
L.miss <- list(list(1,3,c(0,2,0),c(NA)),
list(1,6,c(0,3,2,0,1,0),c(0,0,NA,1,0,0),1,2,c(NA,1),2,c(0,0)),
list(1,0),
list(1,0),
list(1,4,c(2,0,0,0),c(4,1),c(1,NA,0,0,0),0),
list(1,0))
L.want <- list(list(1,3,c(0,2,0)),
list(1,6,c(0,3,2,0,1,0)),
list(1,0),
list(1,0),
list(1,4,c(2,0,0,0),c(4,1)),
list(1,0))
My attempt was to iterate through the list positions and assign NULL:
try <- L.miss
for (i in 1:length(try)){
for (k in 1:length(try[[i]])){
if (any(is.na(try[[i]][[k]]))){
try[[i]][k:length(try[[i]])] <- NULL
}
}
}
But this returns the error: Error in try1[[i]][[k]] : subscript out of bounds.
I assume this is because the range of the k loop is fixed at the original length of the nested list when the loop starts, so once elements are removed, k runs past the new end. I don't know of an alternative approach, despite an exhaustive search.
Any advice is appreciated!
Here's a way:
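out_list <- lapply(L.miss, function(x) {
  # TRUE for every element of this sublist that contains at least one NA
  inds <- sapply(x, function(x) any(is.na(x)))
  # which.max() gives the index of the first TRUE; keep only what precedes it
  if(any(inds)) x[seq_len(which.max(inds) - 1)] else x
})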
out_list[[2]]
#[[1]]
#[1] 1
#[[2]]
#[1] 6
#[[3]]
#[1] 0 3 2 0 1 0
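If you would rather keep the loop from the question, one possible repair is to truncate the sublist once and then leave the inner loop with break, so that k never runs past the shortened length. This is only a sketch under a couple of assumptions: the name res is illustrative (chosen here to avoid masking base R's try()), and anyNA() is used as a shorthand for any(is.na()).

```r
# Sample data from the question.
L.miss <- list(list(1, 3, c(0, 2, 0), c(NA)),
               list(1, 6, c(0, 3, 2, 0, 1, 0), c(0, 0, NA, 1, 0, 0), 1, 2, c(NA, 1), 2, c(0, 0)),
               list(1, 0),
               list(1, 0),
               list(1, 4, c(2, 0, 0, 0), c(4, 1), c(1, NA, 0, 0, 0), 0),
               list(1, 0))

res <- L.miss  # illustrative name, not from the question
for (i in seq_along(res)) {
  for (k in seq_along(res[[i]])) {
    if (anyNA(res[[i]][[k]])) {
      # Keep only the elements before position k, then stop scanning this
      # sublist: the loop range was computed from the old, longer length.
      res[[i]] <- res[[i]][seq_len(k - 1)]
      break
    }
  }
}
```

With out_list from the lapply() version still in the workspace, identical(res, out_list) is a quick way to check that both approaches agree on the sample data.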