dplyr: max value in a group, excluding the value in each row?

I have a data frame that looks as follows:

> df <- data_frame(g = c('A', 'A', 'B', 'B', 'B', 'C'), x = c(7, 3, 5, 9, 2, 4))
> df
Source: local data frame [6 x 2]

  g x
1 A 7
2 A 3
3 B 5
4 B 9
5 B 2
6 C 4

I know how to add a column with the maximum x value for each group g:

> df %>% group_by(g) %>% mutate(x_max = max(x))
Source: local data frame [6 x 3]
Groups: g

  g x x_max
1 A 7     7
2 A 3     7
3 B 5     9
4 B 9     9
5 B 2     9
6 C 4     4

But what I would like is to get is the maximum x value for each group g, excluding the x value in each row.

For the given example, the desired output would look like this:

Source: local data frame [6 x 3]
Groups: g

  g x x_max x_max_exclude
1 A 7     7             3
2 A 3     7             7
3 B 5     9             9
4 B 9     9             5
5 B 2     9             9
6 C 4     4            NA

I thought I might be able to use row_number() to remove particular elements and take the max of what remained, but hit warning messages and got incorrect -Inf output:

> df %>% group_by(g) %>% mutate(x_max = max(x), r = row_number(), x_max_exclude = max(x[-r]))
Source: local data frame [6 x 5]
Groups: g

  g x x_max r x_max_exclude
1 A 7     7 1          -Inf
2 A 3     7 2          -Inf
3 B 5     9 1          -Inf
4 B 9     9 2          -Inf
5 B 2     9 3          -Inf
6 C 4     4 1          -Inf
Warning messages:
1: In max(c(4, 9, 2)[-1:3]) :
  no non-missing arguments to max; returning -Inf
2: In max(c(4, 9, 2)[-1:3]) :
  no non-missing arguments to max; returning -Inf
3: In max(c(4, 9, 2)[-1:3]) :
  no non-missing arguments to max; returning -Inf

What is the most {readable, concise, efficient} way to get this output in dplyr? Any insight into why my attempt using row_number() doesn't work would also be much appreciated. Thanks for the help.

Solution

You could try:

df %>% 
  group_by(g) %>% 
  arrange(desc(x)) %>% 
  mutate(max = ifelse(x == max(x), x[2], max(x)))

Which gives:

#Source: local data frame [6 x 3]
#Groups: g
#
#  g x max
#1 A 7   3
#2 A 3   7
#3 B 9   5
#4 B 5   9
#5 B 2   9
#6 C 4  NA

Benchmark

I've tried the solutions so far on the benchmark:

df <- data.frame(g = sample(LETTERS, 10e5, replace = TRUE),
                 x = sample(1:10, 10e5, replace = TRUE))

library(microbenchmark)

mbm <- microbenchmark(
  steven = df %>% 
    group_by(g) %>% 
    arrange(desc(x)) %>% 
    mutate(max = ifelse(x == max(x), x[2], max(x))),
  eric = df %>% 
    group_by(g) %>% 
    mutate(x_max = max(x), 
           x_max2 = sort(x, decreasing = TRUE)[2], 
           x_max_exclude = ifelse(x == x_max, x_max2, x_max)) %>% 
    select(-x_max2),
  arun = setDT(df)[order(x), x_max_exclude := c(rep(x[.N], .N-1L), x[.N-1L]), by=g],
  times = 50
)

@Arun's data.table solution is the fastest:

# Unit: milliseconds
#    expr       min        lq      mean    median       uq      max neval cld
#  steven 158.58083 163.82669 197.28946 210.54179 212.1517 260.1448    50  b 
#    eric 223.37877 228.98313 262.01623 274.74702 277.1431 284.5170    50   c
#    arun  44.48639  46.17961  54.65824  47.74142  48.9884 102.3830    50 a

enter image description here