
Avoid collapsing dimensions when omitting NAs from array


I have an array from which I have to omit NA values. I know that it is an array full of matrices where every row has exactly one NA value. My approach works well when the matrices have more than two columns (the third dimension of the array in the example below), but apply() drops one dimension when there are only two columns, because after omitting the NA values only one column is left. As this step is part of a much larger codebase, I would like to avoid recoding the rest and instead make this step robust to the case where the number of columns is two. Here is a simple example:

#create an array
arr1 <- array(rnorm(3000),c(500,2,3))

#randomly distribute 1 NA value per row of the array
for(i in 1:500){
  arr1[i,,sample(3,1)] <- NA
}

#omit the NAs from the array
arr1.apply <- apply(arr1, c(1,2),na.omit)

#no dimension is lost because every dimension of the result has size >1
dim(arr1.apply)
[1]   2 500   2


#now repeat with a 500x2x2 array

#create an array
arr2 <- array(rnorm(2000),c(500,2,2))

#randomly distribute 1 NA value per row of the array
for(i in 1:500){
  arr2[i,,sample(2,1)] <- NA
}

#omit the NAs from the array
arr2.apply <- apply(arr2, c(1,2),na.omit)

#we lose one dimension because the last dimension collapses to size 1
dim(arr2.apply)
[1] 500   2

I do not want apply() to drop the last dimension as it breaks the rest of my code.

I am aware that this is a known issue with apply(). However, I would like to resolve the problem in this very step, so any help would be appreciated. So far I have tried wrapping apply() in an array() call with the dimensions that should result, but I think this mixes up the values in the matrix in an undesirable way.

Thanks for your help.


Solution

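  • This is a crude solution, but I don't think you have much choice if you want to keep this approach:

    arr1.apply <- if (dim(arr1)[3] > 2) {
      # more than one value survives per cell, so apply() keeps three dimensions
      apply(arr1, c(1, 2), na.omit)
    } else {
      # apply() returns a plain matrix here; restore the leading dimension of size 1
      array(apply(arr1, c(1, 2), na.omit), dim = c(1, dim(arr1)[1:2]))
    }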
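
The same idea can be wrapped in a small function so that no branch is needed. This is only a sketch: `omit_na_keep_dims` is a made-up name, not part of base R, and it assumes exactly one NA along the third dimension for every row/column pair, as in the question. Because apply() puts the function's result in the first dimension, the target shape is c(dim(arr)[3] - 1, dim(arr)[1:2]). apply() already returns the values in that order, so array() should only re-attach the dropped dimension rather than reshuffle anything.

```r
# Hypothetical helper (illustration only): drop the NAs along the third
# dimension and always return a 3-d array, even when one value is left.
omit_na_keep_dims <- function(arr) {
  stopifnot(length(dim(arr)) == 3, dim(arr)[3] >= 2)

  # Assumption from the question: exactly one NA per (row, column) cell.
  # rowSums(..., dims = 2) counts the NAs over the third dimension.
  stopifnot(all(rowSums(is.na(arr), dims = 2) == 1))

  n_keep <- dim(arr)[3] - 1

  # x[!is.na(x)] is used instead of na.omit() so that no "na.action"
  # attribute is carried along; the surviving values are the same.
  res <- apply(arr, c(1, 2), function(x) x[!is.na(x)])

  # res is n_keep x rows x cols when n_keep > 1 and rows x cols when
  # n_keep == 1; both have the same element order as the target shape.
  array(res, dim = c(n_keep, dim(arr)[1:2]))
}

# Usage with the 500x2x2 case from the question
set.seed(1)
arr2 <- array(rnorm(2000), c(500, 2, 2))
for (i in 1:500) {
  arr2[i, , sample(2, 1)] <- NA
}
dim(omit_na_keep_dims(arr2))
# [1]   1 500   2
```

With the 500x2x3 array the function returns the same 2x500x2 result as the plain apply() call, so downstream code can index the first dimension in both cases.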