guys
Say I have a dataset that looks like this:
id value year total_value total_frequency
1 10 2019 44 4
1 15 2019 44 4
1 12 2020 44 4
1 7 2020 44 4
2 10 2019 28 2
2 18 2020 28 2
3 27 2019 27 1
I intend to achieve a form like this:
id value year total_value total_frequency 2020_value 2020_frequency
1 10 2019 44 4 19 2
1 15 2019 44 4 19 2
1 12 2020 44 4 19 2
1 7 2020 44 4 19 2
2 10 2019 28 2 18 1
2 18 2020 28 2 18 1
3 27 2019 27 1 0 0
The repetition is necessary for future steps.
I can easily achieve this form via Excel using sumifs() and countifs(), but unable to get the same result in R.
I think I can use dplyr
package to do this but don't know exactly how, hence some advice is needed.
Group by id
and sum
the value
for the year
in 2020 and count the number of rows for it as well.
library(dplyr)
df %>%
group_by(id) %>%
mutate(value_2020 = sum(value[year %in% 2020]),
frequency_2020 = sum(year %in% 2020)) %>%
ungroup
# id value year total_value total_frequency value_2020 frequency_2020
# <int> <int> <int> <int> <int> <int> <int>
#1 1 10 2019 44 4 19 2
#2 1 15 2019 44 4 19 2
#3 1 12 2020 44 4 19 2
#4 1 7 2020 44 4 19 2
#5 2 10 2019 28 2 18 1
#6 2 18 2020 28 2 18 1
#7 3 27 2019 27 1 0 0