I'm trying to group my data by a number of variables before providing a summary table showing the sum of the values within each group.
I have created the below data as an example.
Value <- c(21000,10000,50000,60000,2000, 4000, 5500, 10000, 35000, 40000)
Group <- c("A", "A", "B", "B", "C", "C", "A", "A", "B", "C")
Type <- c(1, 2, 1, 2, 1, 1, 1, 2, 2, 1)
Matrix <- cbind(Value, Group, Type)
I want to group the above data first by the 'Group' variable, and then by the 'Type' variable to then sum the values and get an output similar to the attached example I worked on Excel. I would usually use the aggregate function if I just wanted to group by one variable, but am not sure whether I can translate this for multiple variables?
Further to this I then need to provide an identical table but with the values being calculated with a "count" function rather than a "sum".
Many thanks in advance!
You can supply multiple groupings to aggregate
:
df <- data.frame(Value, Group, Type)
> aggregate(df$Value, list(Type = df$Type, Group = df$Group), sum)
Type Group x
1 1 A 26500
2 2 A 20000
3 1 B 50000
4 2 B 95000
5 1 C 46000
> aggregate(df$Value, list(Type = df$Type, Group = df$Group), length)
Type Group x
1 1 A 2
2 2 A 2
3 1 B 1
4 2 B 2
5 1 C 3
There are other packages which may be easier to use such as data.table
:
>library(data.table)
>dt <- as.data.table(df)
>dt[, .(Count = length(Value), Sum = sum(Value)),
by = .(Type, Group)]
Type Group Count Sum
1: 1 A 2 26500
2: 2 A 2 20000
3: 1 B 1 50000
4: 2 B 2 95000
5: 1 C 3 46000
dplyr
is another option and @waskuf has good example of that.