r dataframe dplyr

How to create a boolean column that collectively responds to another column based on an ID column (using dplyr)?


I am trying to create a table from this:

ID Condition
1 B
1 B
1 A
2 B
2 B

To this:

ID Condition Flag
1 B 1
1 B 1
1 A 1
2 B 0
2 B 0

That is, if Condition is A in any row for a given ID, Flag should be 1 for every row of that ID; otherwise it should be 0.

I have tried something like this:

library(dplyr)

data %>%
   group_by(ID) %>%
   mutate(Flag = ifelse(Condition == "A", 1, 0))

But it just returns this:

ID Condition Flag
1 B 0
1 B 0
1 A 1
2 B 0
2 B 0

Sorry if this is unclear, if there is anything else you need from me for this question please let me know. Thanks.


Solution

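  • ifelse is vectorized: it tests each element separately and returns one value per row, so on its own it cannot produce a flag that depends on the whole group.

    Use a plain if ... else ... with any() instead, so that the test collapses to a single value per ID (the .by argument requires dplyr 1.1.0 or later):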

    > df %>% mutate(Flag = if (any(Condition == "A")) 1 else 0, .by = ID)
      ID Condition Flag
    1  1         B    1
    2  1         B    1
    3  1         A    1
    4  2         B    0
    5  2         B    0
    

    Or use %in%, which returns a single TRUE or FALSE per group; the unary + converts it to 1 or 0:

    > df %>% mutate(Flag = +("A" %in% Condition), .by = ID)
      ID Condition Flag
    1  1         B    1
    2  1         B    1
    3  1         A    1
    4  2         B    0
    5  2         B    0
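
For dplyr versions older than 1.1.0, where the .by argument is not available, the same idea can be sketched with group_by(). The snippet below is a minimal, self-contained example: it rebuilds the sample data from the question (the name df follows the answer above, and the ID column is assumed to be numeric) and contrasts the row-wise ifelse result with the group-wise any() result.

```r
library(dplyr)

# Rebuild the example data from the question (assumed column types).
df <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  Condition = c("B", "B", "A", "B", "B")
)

# Row-wise test: one value per row, so only the "A" row itself is flagged.
row_flag <- ifelse(df$Condition == "A", 1, 0)

# Group-wise test: any() collapses each ID's rows to a single TRUE/FALSE,
# and mutate() recycles that single value across every row of the group.
result <- df %>%
  group_by(ID) %>%
  mutate(Flag = as.integer(any(Condition == "A"))) %>%
  ungroup()

print(result)
```

The ungroup() call at the end is optional; it only keeps later operations on result from being applied per ID by accident.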