Search code examples
rdataframesubsetreshape

Find the maximum value along with its group ids for each column


I have the following dataframe, df1:

    ID   Group    Time1    Time2    Time3
A00194       1    0.733    0.777    0.433
A00195       1    0.903    0.116    0.308
A00198       1    0.422    0.863    0.220
A00199       1    0.485    0.846    0.203
A02111       2    0.682    0.522    0.700
A02114       2    0.699    0.208    0.686
A02116       2    0.911    0.802    0.041
A02197       2    0.083    0.082    0.900

I would like to get ID and Group with the highest value in each Time1:Time3.

Desired Outpue will like :

    ID   Group   Value   Test
A02116       2   0.911  Time1
A00198       1   0.863  Time2
A02197       2   0.900  Time3

I tried the following code, but this requires me to do it three time to get the desired ouptut.

df1[which.max(df1$Time1),,c(1:4)]

How can I achieve this?


Solution

  • A long format data requires to find a maximum value of each Time.

    You could do with dplyr and tidyr.

    library(dplyr)
    library(tidyr)
    df |> 
      pivot_longer(contains("Time"),names_to = "Test") |> 
      filter(value == max(value),.by=Test) |> 
      arrange(Test)
    

    output

     ID     Group Test  value
      <chr>  <dbl> <chr> <dbl>
    1 A02116     2 Time1 0.911
    2 A00198     1 Time2 0.863
    3 A02197     2 Time3 0.9  
    

    Or using data.table.

    df_melt = melt(df,
         id.vars = c("ID","Group"),
         variable.name = "Test")
    
    df_melt[df_melt[,.I[which.max(value)],by=Test]$V1]
    

    output

           ID Group   Test value
    1: A02116     2  Time1 0.911
    2: A00198     1  Time2 0.863
    3: A02197     2  Time3 0.900