df<-read.table(textConnection('egg 1 20 a
egg 2 30 a
jap 3 50 b
jap 1 60 b'))
> df
V1 V2 V3 V4
1 egg 1 20 a
2 egg 2 30 a
3 jap 3 50 b
4 jap 1 60 b
My real data has no factors, so I convert the factor columns to character:
> df$V1 <- as.character(df$V1)
> df$V4 <- as.character(df$V4)
I would like to "collapse" the data frame by V1, keeping the max of V2, the mean of V3, and the mode of V4.
Please note that this is a general question: my dataset is much larger, and I may want to use different functions (e.g. last, first, min, max, variance, standard deviation, etc.) for different variables when collapsing. Hence the functions argument could be quite long.
In this case I would want output of the form:
> df.collapse
V1 V2 V3 V4
1 egg 2 25 a
2 jap 3 55 b
The plyr package will help you:
library(plyr)
ddply(df, .(V1), summarize, V2 = max(V2), V3 = mean(V3), V4 = toupper(V4)[1])
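R has no built-in function for the statistical mode (base mode() returns an object's storage mode, not its most frequent value), so I used a different function for V4 as a placeholder. A mode function is easy to implement, though.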
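For illustration, here is a minimal sketch of such a mode function, together with a base-R version of the collapse so that it runs without plyr. The name stat_mode is hypothetical (it is not part of base R or plyr), and breaking ties in favour of the value that appears first is an assumption, not something the question specifies.

```r
# Hypothetical helper: return the most frequent value of x.
# Ties go to the value that appears first in x (an assumption).
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Same example data as in the question, read as character rather than factor.
df <- read.table(text = "egg 1 20 a
egg 2 30 a
jap 3 50 b
jap 1 60 b", stringsAsFactors = FALSE)

# Split by V1, build one summary row per group, then bind the rows together.
df.collapse <- do.call(rbind, lapply(split(df, df$V1), function(g) {
  data.frame(V1 = g$V1[1],
             V2 = max(g$V2),
             V3 = mean(g$V3),
             V4 = stat_mode(g$V4),
             stringsAsFactors = FALSE)
}))
rownames(df.collapse) <- NULL
df.collapse
```

With plyr loaded, the same helper should be able to replace the placeholder in the ddply call above, i.e. V4 = stat_mode(V4) instead of V4 = toupper(V4)[1].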