r dplyr summarize

Is there a simple way to add the result of dplyr's summarize to every row?


The following is a simple version of my data:

[image: sample dataset]

I want to create a flag for each group that indicates whether the group has at least one item in Column1. I know I can compute this with dplyr and then merge it back into my original data, but I was wondering if there is an easier way.

For example, I can do this:

df_column <- df %>%
  filter(!is.na(Column1)) %>%
  group_by(Group) %>%
  summarize(n = n_distinct(Column1))

and then I can merge this with the original data and create a flag.
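
For reference, the two-step approach described here might look like the sketch below. The toy `df` is an assumption (the real data is only shown as an image), and the names `df_flagged` and `flag` are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for the original dataset
df <- tibble(
  Group   = c("A", "A", "B", "B", "C"),
  Column1 = c("x", NA, NA, NA, "y")
)

# Step 1: count distinct non-missing values per group
df_column <- df %>%
  filter(!is.na(Column1)) %>%
  group_by(Group) %>%
  summarize(n = n_distinct(Column1))

# Step 2: join the counts back onto every row and derive the flag.
# Groups with no non-missing values were dropped by filter(), so their
# n is NA after the join and has to be treated as "no items".
df_flagged <- df %>%
  left_join(df_column, by = "Group") %>%
  mutate(flag = !is.na(n) & n > 0) %>%
  select(-n)
```

Note that the filter step drops all-NA groups from `df_column`, which is why the flag has to handle `NA` counts after the join.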


Solution

  • Without filtering first, we can do this with mutate: after grouping by 'Group', create a logical column based on the number of unique non-NA elements (n_distinct) in 'Column1'.

    library(dplyr)
    df %>%
      group_by(Group) %>%
      # TRUE if the group has at least one non-NA value in Column1
      mutate(flag = n_distinct(Column1[!is.na(Column1)]) > 0)
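
If the goal is simply "at least one non-missing value in the group", `any(!is.na(Column1))` expresses that directly without counting. Below is a minimal sketch on made-up data; the toy `df` and the name `result` are assumptions, since the original dataset is only shown as an image.

```r
library(dplyr)

# Hypothetical toy data; not the asker's actual dataset
df <- tibble(
  Group   = c("A", "A", "B", "B", "C"),
  Column1 = c("x", NA, NA, NA, "y")
)

result <- df %>%
  group_by(Group) %>%
  # mutate() keeps every row and recycles the per-group value across the group
  mutate(flag = any(!is.na(Column1))) %>%
  ungroup()
```

Because `mutate` on a grouped data frame keeps all rows, there is no separate summarize-and-merge step; `ungroup()` is added so later operations are not silently applied per group.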