I am trying to add the values in column C based on unique values in column B.For instance,for B = 1,I would like to add all rows in column C i.e. 5+4+3=12.
A B C
1 1 5
2 1 4
3 1 3
4 2 1
5 2 3
for(i in unique(df$B)){
df$D = sum(df$C)
}
Also, I would like to add the number of times each data in column B occurs.
Solution :
A B C D E
1 1 5 12 3
2 1 4 12 3
3 1 3 12 3
4 2 1 4 2
5 2 3 4 2
example from my task :
docIdx newsgroup_ID freq
1 1 768
2 1 125
3 1 29
4 1 51
5 1 198
6 1 34
7 1 64
8 2 35
9 2 70
10 2 45
In base R you could use ave
df[, c("D", "E")] <- with(df, sapply(c(sum, length), function(x) ave(C, B, FUN = x)))
df
# A B C D E
#1 1 1 5 12 3
#2 2 1 4 12 3
#3 3 1 3 12 3
#4 4 2 1 4 2
#5 5 2 3 4 2
Or using dplyr
library(dplyr)
df <- df %>%
group_by(B) %>%
mutate(D = sum(C), E = length(C))
df
## A tibble: 5 x 5
## Groups: B [2]
# A B C D E
# <int> <int> <int> <int> <int>
#1 1 1 5 12 3
#2 2 1 4 12 3
#3 3 1 3 12 3
#4 4 2 1 4 2
#5 5 2 3 4 2
df <- read.table(text =
"A B C
1 1 5
2 1 4
3 1 3
4 2 1
5 2 3", header = T)
It works just fine with your revised data
df <- read.table(text =
"docIdx newsgroup_ID freq
1 1 768
2 1 125
3 1 29
4 1 51
5 1 198
6 1 34
7 1 64
8 2 35
9 2 70
10 2 45", header = T)
df[, c("sum.freq", "length.freq")] <- with(df, sapply(c(sum, length), function(x)
ave(freq, newsgroup_ID, FUN = x)))
# docIdx newsgroup_ID freq sum.freq length.freq
#1 1 1 768 1269 7
#2 2 1 125 1269 7
#3 3 1 29 1269 7
#4 4 1 51 1269 7
#5 5 1 198 1269 7
#6 6 1 34 1269 7
#7 7 1 64 1269 7
#8 8 2 35 150 3
#9 9 2 70 150 3
#10 10 2 45 150 3
Here ave(freq, newsgroup_ID, FUN = x)
applies function x
to freq
by newsgroup_ID
.