Search code examples
rdataframedplyrgrouping

How to count data frame elements grouped by multiple conditions in dplyr?


I am trying to use dplyr to count elements grouped by multiple conditions (columns) in a data frame. In the below example (dataframe output is at the top (except that I manually inserted the 2 right-most columns to explain what I am trying to do), and R code is underneath), I am trying to count the joint groupings of the Element and Group columns. My multiple condition grouping attempt is eleGrpCnt. Any recommendations for the correct way to do this in dplyr? I thought that group_by a combined (Element, Group) would work.

                                                   desired
       Element Group origOrder eleCnt eleGrpCnt  eleGrpCnt   explanation
       <chr>   <dbl>     <int>  <int>     <int>  <comment>   <comment>
     1 B           0         1      1         1          1   1st grouping of B where Group = 0
     2 R           0         2      1         1          1   1st grouping of R where Group = 0
     3 R           1         3      2         1          2   2nd grouping of R where Group = 1
     4 R           1         4      3         2          2   2nd grouping of R where Group = 1 
     5 B           0         5      2         2          1   1st grouping of B where Group = 0 
     6 X           2         6      1         1          1   1st grouping of X where Group = 2 
     7 X           2         7      2         2          1   1st grouping of X where Group = 2 
     8 X           0         8      3         1          2   2nd grouping of X where Group = 0
     9 X           0         9      4         2          2   2nd grouping of X where Group = 0
    10 X          -1        10      5         1          3   3rd grouping of X where Group = -1 
    
library(dplyr)

myData6 <- 
  data.frame(
    Element = c("B","R","R","R","B","X","X","X","X","X"),
    Group = c(0,0,1,1,0,2,2,0,0,-1)
  )

myData6 %>% 
  mutate(origOrder = row_number()) %>%
  group_by(Element) %>%
    mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  group_by(Element, Group) %>%
    mutate(eleGrpCnt = row_number())%>%
  ungroup()

Solution

  • If you group by element then the numbers you are looking for are simply the matches of Group against the unique values of Group:

    library(dplyr)
    
    myData6 %>% 
      mutate(origOrder = row_number()) %>%
      group_by(Element) %>%
      mutate(eleCnt = row_number()) %>%
      ungroup() %>%
      group_by(Element) %>%
      mutate(eleGrpCnt = match(Group, unique(Group)))
    #> # A tibble: 10 x 5
    #> # Groups:   Element [3]
    #>    Element Group origOrder eleCnt eleGrpCnt
    #>    <chr>   <dbl>     <int>  <int>     <dbl>
    #>  1 B           0         1      1         1
    #>  2 R           0         2      1         1
    #>  3 R           1         3      2         2
    #>  4 R           1         4      3         2
    #>  5 B           0         5      2         1
    #>  6 X           2         6      1         1
    #>  7 X           2         7      2         1
    #>  8 X           0         8      3         2
    #>  9 X           0         9      4         2
    #> 10 X          -1        10      5         3
    

    Created on 2022-09-11 with reprex v2.0.2