
R: Build Apply function to find minimum of columns based on conditions in other (related) columns


Given data like the example below, I'm trying to replace each test column (test_A, etc.) with the value from its corresponding time column (time_A, etc.) wherever the test is TRUE, and then find, for each row, the minimum of the times whose tests are true.

     [ID] [time_A] [time_B] [time_C] [test_A] [test_B] [test_C] [min_true_time]
[1,]    1        2        3        4    FALSE     TRUE     FALSE          ?
[2,]    2       -4        5        6     TRUE     TRUE     FALSE          ?
[3,]    3        6        1       -2     TRUE     TRUE      TRUE          ?
[4,]    4       -2        3        4     TRUE    FALSE     FALSE          ?

My actual data set is quite large, so my attempts at if statements and for loops have failed miserably, and I can't make any progress on an apply function.

A more negative time counts as smaller, so -2 would be considered the minimum for row 3.

Any suggestions are gladly welcomed.


Solution

  • You don't give much information, but I think this does what you need. No idea if it is efficient enough, since you don't say how big your dataset actually is.

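    # I assume your data is in a data.frame. The header row has one field fewer
    # than the data rows, so read.table() uses the first column as row names.
    df <- read.table(header = TRUE, text = "ID time_A time_B time_C test_A test_B test_C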
    1    1        2        3        4    FALSE     TRUE     FALSE
    2    2       -4        5        6     TRUE     TRUE     FALSE
    3    3        6        1       -2     TRUE     TRUE      TRUE
    4    4       -2        3        4     TRUE    FALSE     FALSE")
    
    
    # Loop over all rows: subset the time columns (2:4) with that row's test columns (5:7), then take the minimum
    df$min_true_time <- sapply(seq_len(nrow(df)), function(i) min(df[i, 2:4][unlist(df[i, 5:7])]))
    df
    #  ID time_A time_B time_C test_A test_B test_C min_true_time
    #1  1      2      3      4  FALSE   TRUE  FALSE             3
    #2  2     -4      5      6   TRUE   TRUE  FALSE            -4
    #3  3      6      1     -2   TRUE   TRUE   TRUE            -2
    #4  4     -2      3      4   TRUE  FALSE  FALSE            -2
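
    To make the subsetting step concrete, here is a minimal sketch of what happens for a single row, written with plain vectors instead of the data.frame. The names `times`, `tests`, `kept`, `row1_min` and `row3_min` are only for illustration and are not part of the code above.

    ```r
    # Row 1 of the example data, written out as plain vectors.
    times <- c(time_A = 2, time_B = 3, time_C = 4)
    tests <- c(test_A = FALSE, test_B = TRUE, test_C = FALSE)

    kept <- times[tests]    # logical subsetting keeps only time_B
    row1_min <- min(kept)   # 3

    # Row 3: every test is TRUE, so all three times compete and -2 wins.
    row3_min <- min(c(6, 1, -2)[c(TRUE, TRUE, TRUE)])
    ```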
    

    Another way, which might be faster (I'm not in the mood for benchmarking):

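    # Copy the time columns into a numeric matrix
    m <- as.matrix(df[, 2:4])
    # Blank out every time whose test is FALSE
    m[!df[, 5:7]] <- NA
    # Row-wise minimum over the remaining (true-test) times
    df$min_true_time <- apply(m, 1, min, na.rm = TRUE)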
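
    One edge case the question does not cover is a row in which no test is TRUE. With the matrix approach, `min(..., na.rm = TRUE)` on an all-NA row returns `Inf` with a warning. A possible variation, sketched below, keeps the same NA masking but takes the minimum with `pmin()`, which works on whole columns at once and returns `NA` for such a row. The fifth row in this data is hypothetical and was added only to show that case; whether this is actually faster on your data would need benchmarking.

    ```r
    # Example data from the question, plus a hypothetical fifth row
    # in which no test is TRUE.
    df <- data.frame(
      ID     = 1:5,
      time_A = c(2, -4, 6, -2, 7),
      time_B = c(3, 5, 1, 3, 8),
      time_C = c(4, 6, -2, 4, 9),
      test_A = c(FALSE, TRUE, TRUE, TRUE, FALSE),
      test_B = c(TRUE, TRUE, TRUE, FALSE, FALSE),
      test_C = c(FALSE, FALSE, TRUE, FALSE, FALSE)
    )

    time_cols <- c("time_A", "time_B", "time_C")
    test_cols <- c("test_A", "test_B", "test_C")

    # Mask every time whose test is FALSE.
    m <- as.matrix(df[, time_cols])
    m[!as.matrix(df[, test_cols])] <- NA

    # pmin() compares the columns element by element, so there is no
    # per-row function call. With na.rm = TRUE it yields NA (not Inf)
    # where all values in a row are NA.
    df$min_true_time <- do.call(pmin, c(as.data.frame(m), na.rm = TRUE))
    df$min_true_time
    ```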