I have column A, which is a date and B which is a string value.
A | B | |
---|---|---|
1 | 01/02/23 | apple |
2 | 01/02/23 | pear |
3 | 01/02/23 | apple |
4 | 02/02/23 | apple |
3 | 02/02/23 | apple |
3 | 02/02/23 | banana |
3 | 02/02/23 | banana |
3 | 02/02/23 | apple |
I would like the output to be a list on unique dates, with cumulation day on day of the string values in B.
Like so
A | B | |
---|---|---|
1 | 01/02/23 | apple |
pear | ||
1 | 02/02/23 | apple |
pear | ||
banana |
Thank you in advance!!!!
I've tried basic cumulations but cant seem to get to the desired result when i use string value not numbers.
Update removed first answer: We can do it using accumulate
together with union
and some data tweaking:
library(dplyr)
library(tidyr)
df %>%
arrange(A) %>%
mutate(cumulative = accumulate(B, ~union(.x, .y))) %>%
reframe(B = last(cumulative), .by=A) %>%
unnest(B) %>%
mutate(A = ifelse(row_number() != 1, "", A), .by=A)
A B
1 1 apple
2 pear
3 2 apple
4 pear
5 banana