Tags: r, dplyr, frequency

R: Find the unique count (frequency) of a column by patient per treatment group


I have a dataset of patient data. I want to get the frequency of COL3 for each unique value of COL1 across COL2. COL2 is the treatment group, and I need to derive the unique counts of the injection types in COL1 per patient in COL3. I need help with R code to calculate this.


Solution

  • The question is quite unclear, but I guess you want something like this: a count of the rows for each combination of patient, type, and injection?

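    library(tidyverse)
    
    # Simulate 100 rows of example data. No seed is set, so the sampled
    # values (and the counts below) differ from run to run.
    df <- tibble(
      COL1 = paste("Type", sample(LETTERS[1:10], 100, replace = TRUE)),
      COL2 = paste("Injection", sample(1:10, 100, replace = TRUE)),
      COL3 = paste("Patient", sample(1:10, 100, replace = TRUE))
    ) %>% 
      relocate(COL3)  # move the patient column to the front
    
    # Count the rows for each patient / type / injection combination
    df %>% 
      count(COL3, COL1, COL2)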
    
    # A tibble: 98 × 4
       COL3      COL1   COL2            n
       <chr>     <chr>  <chr>       <int>
     1 Patient 1 Type B Injection 2     1
     2 Patient 1 Type B Injection 8     1
     3 Patient 1 Type C Injection 6     1
     4 Patient 1 Type C Injection 9     1
     5 Patient 1 Type D Injection 9     1
     6 Patient 1 Type E Injection 6     1
     7 Patient 1 Type F Injection 8     1
     8 Patient 1 Type G Injection 1     1
     9 Patient 1 Type G Injection 6     1
    10 Patient 1 Type I Injection 1     1
    # … with 88 more rows
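
If what you actually need is the number of distinct injection types per patient within each treatment group, rather than a row count per combination, `n_distinct()` inside `summarise()` may be closer. This is only a sketch under my reading of the question: I am assuming COL1 holds the injection type, COL2 the treatment group, and COL3 the patient. The toy data and the names `distinct_types` and `n_types` are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data (not from the question). Assumed column roles:
# COL1 = injection type, COL2 = treatment group, COL3 = patient.
df <- data.frame(
  COL3 = c("Patient 1", "Patient 1", "Patient 1", "Patient 2", "Patient 2"),
  COL2 = c("Group A",   "Group A",   "Group B",   "Group A",   "Group A"),
  COL1 = c("Type X",    "Type Y",    "Type X",    "Type X",    "Type X")
)

# For each treatment group and patient, count how many different
# injection types appear (repeated types are counted once).
distinct_types <- df %>%
  group_by(COL2, COL3) %>%
  summarise(n_types = n_distinct(COL1), .groups = "drop")

distinct_types
# Group A / Patient 1 -> 2 types
# Group A / Patient 2 -> 1 type (Type X appears twice but counts once)
# Group B / Patient 1 -> 1 type
```

The difference from `count(COL3, COL1, COL2)` is that `count()` returns one row per combination with its number of occurrences, while this returns one row per patient and group with the number of different types seen.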