Using glue on single rows in grouped dataframe in R


I have this very simple dataframe in R:

library(dplyr)
library(glue)

df = data.frame(
  class=c("a", "a", "b")
)

Now I want to check whether a group has more than one row and, based on that, create a new column (called e.g. class_2) like this:

df %>% 
  group_by(class) %>%
  mutate(class_2 = if_else(n() > 1, glue("{class}_{row_number()}"), class))

However, I get the error:

Error in `mutate()`:
ℹ In argument: `class_2 = if_else(n() > 1, glue("{class}_{row_number()}"), class)`.
ℹ In group 1: `class = "a"`.
Caused by error in `if_else()`:
! `true` must have size 1, not size 2.
Run `rlang::last_trace()` to see where the error occurred.

I understand the error message itself, but I don't really understand why it occurs. Is it because glue is using all the values from the currently grouped data?

Also, what would be an easy fix? I could create a new column, ungroup the data, and compute the column like this:

df %>% 
  group_by(class) %>%
  mutate(n = n()) %>% 
  ungroup() %>% 
  mutate(class_2 = if_else(n > 1, glue("{class}_{row_number()}"), class))

But is this the best option? It also gives a different result, because I want the row numbers within the current group, not across the entire dataframe.


Solution

  • The problem is that the test in if_else (i.e. n() > 1) resolves to a single value TRUE (i.e. a logical vector of length 1), so the true and false expressions must have length 1 as well. But in this example, for group 'a', row_number() and class both resolve to vectors of length 2, so the glue() result has length 2 too.

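    You can get around this by using if( ) { } else { } instead of if_else(). Plain if tests the single TRUE/FALSE once per group and returns the whole vector from the chosen branch, so the branch lengths no longer have to match the length of the test.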

    
    df %>% 
      group_by(class) %>%
      mutate(class_2 = if (n() > 1) { glue("{class}_{row_number()}") } else { class })
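
The size rule described above can be reproduced outside of mutate(), and if_else() can also be kept if the test is stretched to the group size. The following is only a sketch, not part of the original answer: the names res and out are illustrative, rep(n() > 1, n()) is one possible way to make the test as long as the group, and as.character() is added as a precaution so that both branches are plain character vectors. The exact wording of the error may differ between dplyr versions.

```r
library(dplyr)
library(glue)

df <- data.frame(class = c("a", "a", "b"))

# A length-1 test with length-2 branches fails, which is what happens
# inside group "a" in the question.
res <- try(if_else(TRUE, c("a_1", "a_2"), c("a", "a")), silent = TRUE)
inherits(res, "try-error")

# Alternative sketch: repeat the test so it has one value per row of the group.
out <- df %>%
  group_by(class) %>%
  mutate(class_2 = if_else(rep(n() > 1, n()),
                           as.character(glue("{class}_{row_number()}")),
                           class)) %>%
  ungroup()

out$class_2
```

Because the data stay grouped, row_number() counts within each class rather than across the whole dataframe, which is the behaviour the question asks for.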