
Summarise multiple columns in R based on top 5 values


I am trying to summarise multiple columns in R based on the top 5 values of each variable, per group. An example of the data is below.

df

ID  A   B   C   D

A   325 68  8   8
B   308 85  2   7
B   342 99  6   2
A   439 83  9   6
A   278 60  10  2
A   367 78  14  4
C   136 59  12  5
C   259 73  11  4
B   338 79  5   6
B   461 99  3   7
D   364 73  14  4
D   238 80  3   8
A   266 54  10  10

My current code looks like this:

    df2 <- df %>% group_by(ID) %>% top_n(5, A) %>% summarise(ATop5 = mean(A))

The output in df2 shows the information I need for column A.

However, I have multiple variables in the original data frame that I want to summarise the same way, with the results appearing in the same output as df2.

Currently I am producing a separate data frame for each variable and then combining them into a single data frame via the ID column.

Being able to skip this step would be of great help.


Solution

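  • An option with summarise_at: within each group, sort each column, keep the five largest values with tail(), and take their mean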

    library(dplyr)
    df %>%
       group_by(ID) %>%
       # for each of columns A to D: sort ascending, keep the last (largest) 5, average them
       summarise_at(vars(A:D), ~ mean(tail(sort(.), 5)))
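
For reference, the same idea can be sketched with across(), which newer dplyr versions (1.0.0 and later) offer as the replacement for the summarise_at family. The snippet below rebuilds the sample data from the question so it runs on its own. The helper name top_n_mean and the "{.col}Top5" naming pattern are illustrative choices, not part of the original answer; the pattern is only there to reproduce the ATop5-style column name used in the question.

```r
library(dplyr)

# Sample data copied from the question.
df <- data.frame(
  ID = c("A", "B", "B", "A", "A", "A", "C", "C", "B", "B", "D", "D", "A"),
  A  = c(325, 308, 342, 439, 278, 367, 136, 259, 338, 461, 364, 238, 266),
  B  = c(68, 85, 99, 83, 60, 78, 59, 73, 79, 99, 73, 80, 54),
  C  = c(8, 2, 6, 9, 10, 14, 12, 11, 5, 3, 14, 3, 10),
  D  = c(8, 7, 2, 6, 2, 4, 5, 4, 6, 7, 4, 8, 10)
)

# Hypothetical helper (not from the original answer):
# mean of the n largest values of x.
top_n_mean <- function(x, n = 5) {
  mean(head(sort(x, decreasing = TRUE), n))
}

df2 <- df %>%
  group_by(ID) %>%
  # apply the helper to columns A to D and name the results ATop5, BTop5, ...
  summarise(across(A:D, top_n_mean, .names = "{.col}Top5"))

df2
```

Two behaviours differ from the top_n() code in the question and may matter for real data: top_n(5, A) keeps every row tied with the fifth-largest value, so it can average more than five values, whereas the sort-based approach always uses at most five; and sort() drops NA values by default.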