I have a data as below:
#dt
Method ID Source Amt
A 1 X 10
A 1 Y 20
C 1 Z 30
B 2 Y 15
D 2 Z 10
C 3 X 20
D 3 X 20
E 4 Z 10
E 4 Z 10
What I want is:
ID Total_Amt Method_A Method_B Method_C Method_D Method_E Source_X Source_Y Source_Z
1 60 2 0 1 0 0 1 1 1
2 25 0 1 0 1 0 0 1 1
3 40 0 0 1 1 0 2 0 0
4 20 0 0 0 0 2 0 0 2
For the Method
and Source
columns, I want to calculate the count by their ID
and use dcast
to transform to wide format and also add up Amt
column by ID
.
Any Help?
Here's one way using dplyr
and tidyr
libraries. We first calculate sum
of Amt
values for each ID
, get the data in long format, count
number of rows and get it back in wide format.
library(dplyr)
library(tidyr)
df %>%
group_by(ID) %>%
mutate(Amt = sum(Amt)) %>%
pivot_longer(cols = c(Method, Source)) %>%
count(ID, value, Amt, name) %>%
pivot_wider(names_from = c(name, value), values_from = n, values_fill = 0)
# ID Amt Method_A Method_C Source_X Source_Y Source_Z Method_B Method_D Method_E
# <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
#1 1 60 2 1 1 1 1 0 0 0
#2 2 25 0 0 0 1 1 1 1 0
#3 3 40 0 1 2 0 0 0 1 0
#4 4 20 0 0 0 0 2 0 0 2