Only keep the minimum value of each group


I have the following data.table:

> dataz <- data.table(group = c("ZAS", "Car", rep("EEE", times = 3), rep("EEff", times = 2), rep("2133", times = 6), "EETTE"),
                    value = runif(14))
> dataz

    group      value
 1:   ZAS 0.27218511
 2:   Car 0.39520602
 3:   EEE 0.46775956
 4:   EEE 0.55071786
 5:   EEE 0.37529203
 6:  EEff 0.01471177
 7:  EEff 0.86282569
 8:  2133 0.20789336
 9:  2133 0.91272858
10:  2133 0.06315207
11:  2133 0.18178237
12:  2133 0.42354538
13:  2133 0.10176267
14: EETTE 0.88492458

I want to keep only the rows that hold the minimum value within each group.

The final data.table should look like this:

    group      value
 1:   ZAS 0.27218511
 2:   Car 0.39520602
 3:   EEE 0.37529203
 4:  EEff 0.01471177
 5:  2133 0.06315207
 6: EETTE 0.88492458

Solution

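  • With .SD (the output below comes from a fresh runif() draw, so its values do not match the question's table):

    dataz[, .SD[value == min(value)], by = .(group)]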
        group      value
       <char>      <num>
    1:    ZAS 0.39590814
    2:    Car 0.42591138
    3:    EEE 0.07049145
    4:   EEff 0.34670793
    5:   2133 0.05702904
    6:  EETTE 0.31071582
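
For a reproducible check, here is a minimal sketch that rebuilds the question's table with a fixed seed (the `set.seed(42)` call and the names `res_sd`, `idx` and `res_i` are my own assumptions, not part of the original post). It runs the `.SD` filter from above next to a common alternative that collects row indices with `.I` and subsets once. Note the difference on ties: `value == min(value)` keeps every row tied for the minimum, while `which.min()` keeps only the first.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the random values are reproducible
dataz <- data.table(
  group = c("ZAS", "Car", rep("EEE", times = 3), rep("EEff", times = 2),
            rep("2133", times = 6), "EETTE"),
  value = runif(14)
)

# .SD approach from the answer: keeps all rows equal to the group minimum
res_sd <- dataz[, .SD[value == min(value)], by = .(group)]

# Alternative sketch: gather the row number of each group's minimum via .I,
# then subset the table once; which.min() returns only the first minimum
idx <- dataz[, .I[which.min(value)], by = group]$V1
res_i <- dataz[idx]

print(res_sd)
print(res_i)
```

With continuous random values like these, ties are not expected, so both results should contain one row per group, in order of each group's first appearance.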