r, dataframe, data.table, dynamic-columns

Access data.table columns through vector indexes?


I'm stuck on a problem and can't find a satisfying answer on the web. I would like to set values in a data.frame (a data.table is fine too) using start:end vectors. An example will clarify what I'm asking.

Suppose I have a data.frame like the following:

df <- data.frame(col_1 = rep(0, 3), col_2 = rep(0, 3), col_3 = rep(0, 3), col_4 = rep(0,3))
df
  col_1 col_2 col_3 col_4
1     0     0     0     0
2     0     0     0     0
3     0     0     0     0

And suppose I have two vectors:

indexesStart <- c(1, 2, 1)
indexesEnd   <- c(2, 4, 3)

For each row, I would like to set to 1 every value in the column range given by the two vectors. The output should be the following:

  col_1 col_2 col_3 col_4
1     1     1     0     0
2     0     1     1     1
3     1     1     1     0

I tried something like this:

df[ , indexesStart:indexesEnd] <- 1

But it doesn't work: it just takes indexesStart[1]:indexesEnd[1] and applies it to all rows.

I must avoid explicit loops because my real data frame has millions of rows and looping is too slow. Any help is appreciated (a data.table solution would be even better).

Thank you


Solution

  • This will do it:

    df <- data.frame(col_1=rep(0,3),col_2=rep(0,3),col_3=rep(0,3),col_4=rep(0,3))
    indexesStart <- c(1, 2, 1)
    indexesEnd   <- c(2, 4, 3)
    
    # Set columns indexesStart[i]..indexesEnd[i] of row i to 1, one row at a time
    for (i in seq_len(nrow(df))) df[i, indexesStart[i]:indexesEnd[i]] <- 1
    
    df
    

    Here is another technique using a two-column matrix as index:

    # Build one (row, column) pair per cell to modify, then stack the pairs
    I <- do.call(rbind, lapply(seq_along(indexesStart), function(i) cbind(i, indexesStart[i]:indexesEnd[i])))
    df[I] <- 1
    

    In the second variant the loop is only hidden: lapply still iterates over the rows, just in a different place.
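
Since both variants above still iterate row by row, here is a sketch of a variant with no R-level loop over rows. It assumes indexesStart and indexesEnd have exactly one entry per row of df. The names colIndex and inRange are introduced here for illustration and are not part of the original answer.

```r
# Example data from the question
df <- data.frame(col_1 = rep(0, 3), col_2 = rep(0, 3),
                 col_3 = rep(0, 3), col_4 = rep(0, 3))
indexesStart <- c(1, 2, 1)
indexesEnd   <- c(2, 4, 3)

# col(df) returns a matrix shaped like df in which every cell holds
# its own column number.
colIndex <- col(df)

# Matrices are stored column by column, so a vector with one element per
# row is recycled down each column: cell [i, j] is compared against
# indexesStart[i] and indexesEnd[i].
inRange <- colIndex >= indexesStart & colIndex <= indexesEnd

# A logical matrix with the same dimensions as df selects individual cells.
df[inRange] <- 1
df
#   col_1 col_2 col_3 col_4
# 1     1     1     0     0
# 2     0     1     1     1
# 3     1     1     1     0
```

The comparison is vectorized over all cells at once, at the cost of allocating temporary matrices of the same shape as df. That is likely acceptable for a table with millions of rows and only a few columns, but it is worth measuring on the real data.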