Search code examples
rdplyrtidyverse

Count number of rows where a value appears in any of two columns in R


I have a dataset like this:

data <- read.csv(text = "foo,bar
a,b
a,a
b,c
c,a
c,b")

I want to compute a table that tells me in how many rows every possible value appears, so something like this:

Value Count
a 3
b 3
c 3

I've tried grouping by the two columns using dplyr and then summarise, but that does not give you a count per value, but per column value. Any idea?


Solution

  • With dplyr, tibble and tidyr, you could do:

    data %>%
     mutate(across(everything(), trimws)) %>%
     rowid_to_column() %>%
     pivot_longer(-rowid) %>%
     group_by(value) %>%
     summarise(n = n_distinct(rowid))
    
      value     n
      <chr> <int>
    1 a         3
    2 b         3
    3 c         3