R Count Sequentially Columns

    set.seed(0)
data = data.frame(ID = 1:1000, X1=runif(1000), X2=runif(1000), DROP1=sample(0:1,r=T),DROP2=sample(0:1,r=T),DROP3=sample(0:1,r=T))

Say this is my data. I wish to do this: count the number of values of DROP1 that equals to 1; then count the number of values of DROP2 among the cases when DROP1 equals to 1; then count the number of values of DROP3 equals to 1 among the cases when DROP2 equals to 1 and DROP1 equals to 1. I can do this manually but our real data file is large and has 80+ DROP variables. The desirable output is just a print out that looks:

DROP1, #
DROP2 (AFTER DROP1), #
DROP3 (AFTER DROP1 & DROP2), #

Solution

Here is an option with base R where we get the 'DROP' column names ('nm1') using grep. Then loop over the sequence of those, get the seq of those, subset the data columns, use Reduce to get a logical vector with & (only TRUE if we have all the columns with 1 for a row, i.e 1=> TRUE, 0 => FALSE), and get the sum of those elements to return the count

nm1 <- grep('^DROP', names(data), value = TRUE)
sapply(seq_along(nm1), function(i)  {i1 <- seq(i)
        sum(Reduce(`&`, data[nm1[i1]])) })
#[1] 503 249 137

Or with data.table

library(data.table)
setDT(data)
lapply(seq_along(nm1), function(i) {
         i1 <- seq(i)
         data[, sum(Reduce(`&`, .SD)), .SDcols = nm1[i1]]

    })

data

set.seed(0)
data <- data.frame(ID = 1:1000, X1=runif(1000), X2=runif(1000), 
           DROP1=sample(0:1,1000, replace = TRUE),
           DROP2=sample(0:1,1000, replace = TRUE),
           DROP3=sample(0:1,1000,replace = TRUE))