Force summary() to report the number of NA's even if there are none


I have many numeric vectors; some contain NA's and some don't. Here is an example with two vectors:

x1 <- c(1,2,3,2,2,4)
summary(x1)
 Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
1.000   2.000   2.000   2.333   2.750   4.000 

x2 <- c(1,2,3,2,2,4,NA)
summary(x2)
 Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
1.000   2.000   2.000   2.333   2.750   4.000       1 

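In the end, I want to rbind all the summaries, but the vector without NA's is one element shorter, so its values get recycled into the NA's column: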

rbind(summary(x1), summary(x2))
     Min. 1st Qu. Median  Mean 3rd Qu. Max. NA's
[1,]    1       2      2 2.333    2.75    4    1
[2,]    1       2      2 2.333    2.75    4    1
Warning message:
In rbind(summary(x1), summary(x2)) :
  number of columns of result is not a multiple of vector length (arg 1)

Is there a way to force summary to always report the NA count, so that rbind works without an error or a warning?

All my attempts failed:

summary(x1, na.rm=FALSE)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1.000   2.000   2.000   2.333   2.750   4.000 
summary(x1, useNA="always")
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1.000   2.000   2.000   2.333   2.750   4.000 
summary(addNA(x1))
   1    2    3    4 <NA> 
   1    3    1    1    0 

I also tried the following, but it is a bit of a hack:

tmp <- rbind(summary(x1[complete.cases(x1)]), summary(x2[complete.cases(x2)]))
tmp <- cbind(tmp, c(sum(is.na(x1)), sum(is.na(x2))))
colnames(tmp)[ncol(tmp)] <- "NA's"
tmp
     Min. 1st Qu. Median  Mean 3rd Qu. Max. NA's
[1,]    1       2      2 2.333    2.75    4    0
[2,]    1       2      2 2.333    2.75    4    1

Solution

  • I have not found a way to force summary() to report the NA count when a vector has no NA's. However, you could write a custom function that appends a zero count in that case:

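    my_summary <- function(v){
      if(!any(is.na(v))){
        # no missing values: append an explicit zero count
        res <- c(summary(v),"NA's"=0)
      } else{
        # summary() already includes the "NA's" entry here
        res <- summary(v)
      }
      return(res)
    }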
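
    For many vectors, the same idea can be applied to a whole list at once. The following is only a sketch of one possible variant, not something built into summary(): the helper name summary_with_na is hypothetical, and it assumes plain numeric vectors. It summarises the non-missing values and always appends the NA count, so every result has the same seven names.

```r
# Hypothetical helper (not part of base R): always returns seven named values.
summary_with_na <- function(v) {
  # six-number summary of the non-missing values only
  stats <- summary(v[!is.na(v)])
  # drop the summary class and append the NA count, which may be zero
  c(unclass(stats), "NA's" = sum(is.na(v)))
}

x1 <- c(1, 2, 3, 2, 2, 4)
x2 <- c(1, 2, 3, 2, 2, 4, NA)

# sapply() returns one column per vector; t() turns that into one row per vector
res <- t(sapply(list(x1 = x1, x2 = x2), summary_with_na))
res
```

    Because every vector yields the same names, the rows line up without recycling, and the NA's column is 0 for x1 and 1 for x2.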