Search code examples
rmultiple-columnsgreatest-n-per-groupsummarize

How to get max of a multiple columns based on other columns in R


I have a dataframe like this.

Name Value1 Numbr
Amy 458 18652
Tom 6785 94632
Rex 9846 44967
Tom 1095 295633
Rex 3782 58362
Amy 1956 56280
Rex 8634 84592

I need to summarize this dataframe by name, get the maximum value of each two numerical column. Finally it should be a new dataframe. That is I want a table like this.

Name Value1 Numbr
Amy 1956 56280
Rex 9846 84592
Tom 6785 295633

Please help!


Solution

  • We can group by Name and then summarise applying max

    df %>% 
      group_by(Name) %>% 
      summarise(Value1 = max(Value1),
                Numbr = max(Numbr))
    # A tibble: 3 × 3
      Name  Value1  Numbr
      <chr>  <int>  <int>
    1 Amy     1956  56280
    2 Rex     9846  84592
    3 Tom     6785 295633
    

    Using R base aggregate

    aggregate(cbind(Value1, Numbr) ~ Name, max, data=df)
    
      Name Value1  Numbr
    1  Amy   1956  56280
    2  Rex   9846  84592
    3  Tom   6785 295633