r dataframe count compare missing-data

Count number of instances where a blank exists in one of 2 columns + r


I need to test a dataframe for completeness of records: for a record to be complete, an entry must be made in both columns. In the example df below, 2 of the 9 entries (ignoring the row where both columns are blank) contain a blank in one of the columns.

df <- data.frame(a = c(1,2,"",4,5,6,7,"",8,9),
                 b = c(9,5,2,7,5,"",3,"",6,8))

The desired output would be a count of 7 or 2, identifying the number of records that are complete or, conversely, incomplete. Instances where both columns are blank would not be counted.


Solution

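  • One option is to sum over a logical matrix: compare 'df' against the empty string (df != '') to get a logical matrix, take the rowSums (TRUE counts as 1, FALSE as 0), and check whether each row's sum equals the number of columns. Summing that check gives n1, the number of complete rows. Subtracting n1 from the total number of rows gives the rows with at least one blank, which also includes rows where both columns are blank; to count only rows with exactly one blank, test rowSums(df == '') == 1 instead.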

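    # complete rows: no blank in any column
    n1 <- sum(rowSums(df != '') == ncol(df))
    n1
    # [1] 7
    # rows with at least one blank (this also counts the row where both are blank)
    n2 <- nrow(df) - n1
    # or, to count only rows with exactly one blank (also skips `NA` comparisons)
    n2 <- sum(rowSums(df == '') == 1, na.rm = TRUE)
    n2
    # [1] 2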
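
If the data may contain `NA` as well as empty strings, the same rowSums idea can be wrapped in a small helper. This is only a sketch: the function name `count_records` and the decision to treat `NA` like a blank are assumptions for illustration, not part of the answer above.

```r
df <- data.frame(a = c(1,2,"",4,5,6,7,"",8,9),
                 b = c(9,5,2,7,5,"",3,"",6,8))

# Hypothetical helper: classify rows by how many of their cells are blank.
# Assumption: both "" and NA count as blank.
count_records <- function(d) {
  blank   <- is.na(d) | d == ""   # logical matrix, TRUE where a cell is blank
  n_blank <- rowSums(blank)       # number of blank cells in each row
  c(complete = sum(n_blank == 0),                   # no blanks at all
    partial  = sum(n_blank > 0 & n_blank < ncol(d)), # some, but not all, blank
    empty    = sum(n_blank == ncol(d)))             # every column blank
}

counts <- count_records(df)
counts
```

For the example df, `counts[["complete"]]` and `counts[["partial"]]` correspond to the 7 and 2 asked for in the question, and the row where both columns are blank is reported separately under `empty`.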