R: Add count for unique values within Group, disregarding other variables within dataframe


I would like to add a new variable to my data frame that gives, for each group, the number of unique values of one variable (state), disregarding the other variables.

Data input

df <- data.frame(id=c(1,2,3,4,5,6,7,8,9),
                 state=c("CT","CT","AK","TX","TX","AZ","GA","TX","WA"),
                 group=c(1,1,2,3,3,3,4,4,4),
                 age=c(12,33,57,98,45,67,16,85,22)
                 )
df

Desired output

want <- data.frame(id=c(1,2,3,4,5,6,7,8,9),
                 state=c("CT","CT","AK","TX","TX","AZ","GA","TX","WA"),
                 group=c(1,1,2,3,3,3,4,4,4),
                 age=c(12,33,57,98,45,67,16,85,22),
                 count=c(1,1,1,2,2,2,3,3,3)
                 )
want

Solution

  • Group by group, then use n_distinct() from dplyr to count the distinct state values within each group

    library(dplyr)

    df %>%
      group_by(group) %>%                    # compute within each group
      mutate(count = n_distinct(state)) %>%  # distinct states in that group
      ungroup()                              # drop the grouping afterwards
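
If dplyr is not available, the same per-group distinct count can be sketched in base R. This is an alternative and not part of the original answer: it assumes the df from the question and uses ave() over row indices, so that the result stays an integer instead of being coerced to character.

```r
# Data from the question
df <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                 state = c("CT", "CT", "AK", "TX", "TX", "AZ", "GA", "TX", "WA"),
                 group = c(1, 1, 2, 3, 3, 3, 4, 4, 4),
                 age = c(12, 33, 57, 98, 45, 67, 16, 85, 22))

# ave() splits the row indices by group, applies FUN to each subset,
# and writes the (recycled) result back in the original row order.
df$count <- ave(seq_along(df$state), df$group,
                FUN = function(i) length(unique(df$state[i])))

df
```

Only state and group enter the calculation, so id and age have no effect on count.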