r Counting occurences of specific number between specific number

I have a matrix where each element is eather 0 or 1. I would like to obtain the frequencies of consecutive occurences of 0's in each row, given the last 0 in the sequence is followed by a 1.

For example:

A row with: 0, 1 , 0, 1, 0, 0

The expected result should be:

Consecutive 0's of length: 1

Frequency : 2

Another row with: 0, 1, 0, 0, 1, 0, 0, 0, 1

The expected result:

Consecutive 0's of length: 1 2 3

Frequency: 1 1 1

A further objective is then to sum the frequencies of the same length in order to know how many times a single 0 was followed by a 1, two consecutive 0's where followed by a 1 etc.

Here is an exemplary matrix on which I would like to apply the routine:

m = matrix( c(1, 0, 1, 1, 1, 1, 0, 0, 0,  0,
      1, 1, 1, 1, 0, 1, 0, 0, 0,  0,
      1, 0, 0, 0, 1, 1, 1, 0, 0,  0,
      0,  1, 0, 0, 0, 0, 0, 1, 1, 1,
      1, 1, 1, 0, 0, 0, 0, 0, 1,  0,
      1, 0, 0, 0, 0, 0, 1, 1, 0,  0),

      ncol = 10, nrow = 6, byrow=TRUE)

The expected result should then be like the matrix below:

result = matrix( c(3, 0, 1, 0, 3, 0, 0, 0, 0, 0), ncol=10, nrow=1)
colnames(result) <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10")

Where the column names are the lengths of consecutive 0's (followed by a 1) and the matrix entries the corresponding frequencies.

Note that I have a very large data matrix, and if possible I'd like to avoid loops. Thanks for any hints, comments and propositions.

Solution

Using base functions. The complication is you are discarding trailing zeros that do not end with 1.

Explanation in line.

set.seed(13L)
numRows <-  10e4
numCols <- 10
m <- matrix(sample(c(0L, 1L), numRows*numCols, replace=TRUE),
    byrow=TRUE, ncol = numCols, nrow = numRows)
#add boundary conditions of all zeros and all ones
m <- rbind(rep(0L, numCols), rep(1L, numCols), m)
#head(m)

rStart <- Sys.time()
lens <- unlist(apply(m, 1, function(x) {
    #find the position of the last 1 while handling boundary condition of all zeros
    idx <- which(x==1)
    endidx <- if (length(idx) == 0) length(x) else max(idx)
    beginidx <- if(length(idx)==0) 1 else min(idx)

    #tabulate the frequencies of running 0s.
    runlen <- rle(x[beginidx:endidx])
    list(table(runlen$lengths[runlen$values==0]))
}))

#tabulating results
res <- aggregate(lens, list(names(lens)), FUN=sum)
ans <- setNames(res$x[match(1:ncol(m), res$Group.1)], 1:ncol(m))
ans[is.na(ans)] <- 0
ans
#     1      2      3      4      5      6      7      8      9     10 
#100108  43559  18593   7834   3177   1175    387    103      0    106 

rEnd <- Sys.time()
print(paste0(round(rEnd - rStart, 2), attr(rEnd - rStart, "units")))
#[1] "27.67secs"

Do let me know the performance after running on the large matrix.