
How to count the number of rows with NAs in R


I'm trying to count the rows of a whole data frame that contain at least one NA, so that I can compute the percentage of such rows out of the total number of rows.

I have already seen the post "Determine the number of rows with NAs", but it only covers a specific range of columns.


Solution

  • tl;dr: to count the rows with at least one NA, use sum(!complete.cases(DF)) or, equivalently, sum(apply(DF, 1, anyNA))

    There are several ways to look at the number, proportion, or position of NA values in a data frame.

    Most of them start from the logical matrix returned by is.na(), which is TRUE for every NA and FALSE everywhere else. For the built-in dataset airquality:

    is.na(airquality)
    

    There are 44 NA values in this data set:

    sum(is.na(airquality))
    # [1] 44
    

    You can look at the total number of NA values per row or column:

    head(rowSums(is.na(airquality)))
    # [1] 0 0 0 0 2 1
    colSums(is.na(airquality))
    #   Ozone Solar.R    Wind    Temp   Month     Day
    #      37       7       0       0       0       0
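
    Since the question is about rows rather than individual values, the per-row counts above can be turned into a row count by testing which of them are greater than zero. A minimal sketch; has_na is only an illustrative name, not something from the original answer:

    ```r
    # TRUE for every row that holds at least one NA
    has_na <- rowSums(is.na(airquality)) > 0

    # number of rows with at least one NA
    sum(has_na)
    # [1] 42

    # positions of the first few such rows
    head(which(has_na))
    ```

    This gives the same count as the complete.cases() approach, and which() additionally reports where the affected rows are.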
    

    If you only need to know whether a row or column contains any NA at all, you can apply anyNA() instead of summing is.na():

    # by row
    head(apply(airquality, 1, anyNA))
    # [1] FALSE FALSE FALSE FALSE  TRUE  TRUE
    sum(apply(airquality, 1, anyNA))
    # [1] 42
    
    
    # by column
    head(apply(airquality, 2, anyNA))
    #   Ozone Solar.R    Wind    Temp   Month     Day 
    #    TRUE    TRUE   FALSE   FALSE   FALSE   FALSE
    sum(apply(airquality, 2, anyNA))
    # [1] 2
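
    Note that apply() first converts the data frame to a matrix. For the column-wise check this step can be skipped, because a data frame is already a list of columns. The following is an alternative sketch using vapply(), not part of the original answer:

    ```r
    # check each column directly, without coercing to a matrix
    vapply(airquality, anyNA, logical(1))

    # number of columns containing at least one NA
    sum(vapply(airquality, anyNA, logical(1)))
    # [1] 2
    ```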
    

    complete.cases() also works, but only row-wise; negate it to flag the rows that have at least one NA:

    sum(!complete.cases(airquality))
    # [1] 42
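
    To get the percentage asked about in the question, it helps that the mean of a logical vector is the proportion of TRUE values. A minimal sketch; pct_rows_with_na() is a hypothetical helper name introduced here for illustration:

    ```r
    # share of rows with at least one NA, as a percentage
    pct_rows_with_na <- function(df) {
      100 * mean(!complete.cases(df))
    }

    # airquality has 42 incomplete rows out of 153
    round(pct_rows_with_na(airquality), 2)
    # [1] 27.45
    ```

    The same result can be written as 100 * sum(!complete.cases(df)) / nrow(df); using mean() just avoids spelling out the division.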