r dplyr tidyr

Remove group with identical values


In the data below, I want to remove an Utterance group when all of its associated f values are identical. I've been trying the following, but it throws an error:

library(tidyverse)
df %>%
  separate_rows(f, convert = TRUE) %>%
  group_by(Utterance) %>%
  filter(if_all(f), ~. != lead(.))

Desired result:

# A tibble: 6 × 2
  Utterance     f
  <chr>     <int>
1 B B C         2
2 B B C         2
3 B B C         3
4 A B C         1
5 A B C         2
6 A B C         3

Data:

df <- data.frame(
  Utterance = c("B B C", "A A A", "A B C"),
  f = c("2,2,3", "1,1,1", "1,2,3")
)

Solution

  • You can use n_distinct() inside filter() to keep only the groups that contain more than one distinct f value.

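    library(dplyr)
    library(tidyr)  # separate_rows() comes from tidyr
    
    df %>%
      separate_rows(f, convert = TRUE) %>%  # one row per f value, converted to integer
      group_by(Utterance) %>%
      filter(n_distinct(f) != 1)            # drop groups whose f values are all identical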
    
    # A tibble: 6 × 2
    # Groups:   Utterance [2]
      Utterance     f
      <chr>     <int>
    1 B B C         2
    2 B B C         2
    3 B B C         3
    4 A B C         1
    5 A B C         2
    6 A B C         3
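
For anyone who wants to run this end to end, here is a self-contained sketch using the sample data from the question. It swaps in `any(f != first(f))` as an alternative condition, which is not part of the answer above but reads closer to the "compare the values with each other" idea in the question. The name `result` and the trailing `ungroup()` are additions for illustration only.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
df <- data.frame(
  Utterance = c("B B C", "A A A", "A B C"),
  f = c("2,2,3", "1,1,1", "1,2,3")
)

result <- df %>%
  separate_rows(f, convert = TRUE) %>%  # split "2,2,3" into one row per value
  group_by(Utterance) %>%
  # any() collapses the comparison to one TRUE/FALSE per group, so the whole
  # group is kept only if at least one value differs from the group's first value
  filter(any(f != first(f))) %>%
  ungroup()                             # drop the grouping so later steps are not grouped

result
```

With the sample data this should leave the six rows belonging to "B B C" and "A B C". The key point in both versions is that the condition inside `filter()` evaluates to a single value per group, so each Utterance is kept or dropped as a whole.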