r dplyr

How can I keep columns when grouping/summarizing?


I can't post the actual code because of an agreement I had to sign, and I'm new to R, so I may not explain this well, but maybe someone can help me anyway.

Let's say I have some data:

A   B    C   D
F1  6.6  10  10
F1  3.1  10  10
A1  1.0  20  10
B1  3.4  20  20

For every value of A, the C and D values are the same. I want to use dplyr to compute the mean of B per group (Bmean) while keeping C and D, like so:

A    Bmean   C    D
F1   4.85    10  10
A1   1.0     20  10
B1   3.4     20  20

How would I do that? My idea was to use something like

df1 %>% dplyr::group_by(A) %>% dplyr::summarize(Bmean = mean(B))

but C and D seem to disappear after this operation. Would it make sense to group_by all columns I want to keep? Or how would that work?

Just to clarify, I would like to use the dplyr syntax, since it's part of a bigger operation, if possible.


Solution

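  • You can do this using base R. The formula B ~ . groups by every column other than B, so C and D are kept; note that the averaged column keeps the name B rather than Bmean:

    aggregate(B ~ ., data = df1, FUN = mean)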
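
Since the question asks for dplyr syntax, here is a minimal sketch of two common ways to keep C and D. It assumes the data frame is called df1 (the name used in the answer above, rebuilt here from the sample data in the question) and a dplyr version of 1.0.0 or later for the .groups argument. The names res_group and res_first are only illustrative.

```r
library(dplyr)

# Sample data rebuilt from the question; df1 is an assumed name
df1 <- data.frame(
  A = c("F1", "F1", "A1", "B1"),
  B = c(6.6, 3.1, 1.0, 3.4),
  C = c(10, 10, 20, 20),
  D = c(10, 10, 10, 20)
)

# Option 1: group by every column that should survive the summary.
# This is safe here because C and D are constant within each A.
res_group <- df1 %>%
  group_by(A, C, D) %>%
  summarise(Bmean = mean(B), .groups = "drop")

# Option 2: group by A only and carry C and D along explicitly,
# taking the first value in each group.
res_first <- df1 %>%
  group_by(A) %>%
  summarise(Bmean = mean(B), C = first(C), D = first(D))

print(res_group)
print(res_first)
```

Both variants should give one row per A. Option 1 places Bmean after the grouping columns, while Option 2 yields the column order A, Bmean, C, D shown in the question. If C or D could vary within an A, Option 1 would split that group into several rows, whereas Option 2 would silently keep only the first value, so the choice depends on which behavior you want.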