I have a dataset that I want to summarize by (let's say) the first three characters of a column. In other words, I want to combine the rows whose values in that column share the same first three letters. For example:
df
title freq
ACM100 3
ACM200 2
ACM300 2
MAT11 1
MAT21 2
CMP00 3
CMP10 3
I want to group the data by the first 3 characters of title and sum the frequencies.
Expected result:
title freq
ACM 7
MAT 3
CMP 6
I would appreciate help doing this in R.
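You can use aggregate with transform: transform replaces title with its first three characters (via substr), and aggregate then sums freq within each resulting group.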
aggregate(freq ~ title, transform(df, title = substr(title, 1, 3)), sum)
# title freq
# 1 ACM 7
# 2 CMP 6
# 3 MAT 3
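Note that aggregate returns the groups sorted alphabetically, so the row order differs from the expected result in the question. If the order of first appearance matters, the following is a minimal self-contained sketch of one way to restore it. The data frame is rebuilt here from the question's example, and the helper names prefix and res are only illustrative.

```r
# Rebuild the example data from the question
df <- data.frame(
  title = c("ACM100", "ACM200", "ACM300", "MAT11", "MAT21", "CMP00", "CMP10"),
  freq  = c(3, 2, 2, 1, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Grouping key: the first three characters of each title
prefix <- substr(df$title, 1, 3)

# Same call as above: sum freq within each three-letter prefix
res <- aggregate(freq ~ title, transform(df, title = substr(title, 1, 3)), sum)

# Reorder the groups by their first appearance in the original data
res <- res[match(unique(prefix), res$title), ]
rownames(res) <- NULL
res
```

With the example data this should list the groups as ACM, MAT, CMP with sums 7, 3 and 6, matching the layout asked for in the question.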