Tags: r, dataframe, plyr, summarization

R: summarizing data without merge


I have a data frame (df) of goals scored against various teams by date:

gamedate teamID Gls
 1992-08-22  CHL  3
 1992-08-22  MNU  1
 1992-08-23  ARS  0
 1992-08-23  LIV  2
 1992-08-24  MNU  0
 1992-08-25  LIV  2
 1992-08-26  ARS  0
 1992-08-26  CHL  0

I wish to produce a summary table showing, for each date, the number of games played and the number of games in which these teams blanked the opposition (Gls equal to 0):

gamedate   games blanks
 1992-08-22   2     0
 1992-08-23   2     1
 1992-08-24   1     1
 1992-08-25   1     0
 1992-08-26   2     2

I can get the games and blanks separately using ddply:

df.a <- ddply(df,"gamedate",function(x) c(count=nrow(x)))
df.b <- ddply(subset(df,Gls==0),"gamedate",function(x) c(count=nrow(x)))

and then merge df.a and df.b to get my answer. However, I am sure there must be a simpler and more elegant solution.


Solution

  • You just need to use summarise:

    Read the data in:

    dat <- read.table(textConnection("gamedate teamID Gls
      1992-08-22  CHL  3
      1992-08-22  MNU  1
      1992-08-23  ARS  0
      1992-08-23  LIV  2
      1992-08-24  MNU  0
      1992-08-25  LIV  2
      1992-08-26  ARS  0
      1992-08-26  CHL  0"),sep = "",header = TRUE)
    

    and then call ddply:

    library(plyr)
    ddply(dat, .(gamedate), summarise, tot = length(teamID), blanks = sum(Gls == 0))
        gamedate tot blanks
    1 1992-08-22   2      0
    2 1992-08-23   2      1
    3 1992-08-24   1      1
    4 1992-08-25   1      0
    5 1992-08-26   2      2
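
For comparison, the same per-date summary can probably be produced in base R without plyr. The following is only a sketch under that assumption, not part of the original answer: it uses tapply to count rows and zero-goal rows per gamedate, and the names games, blanks and res are illustrative.

```r
# Same sample data as in the question
dat <- read.table(text = "gamedate teamID Gls
1992-08-22 CHL 3
1992-08-22 MNU 1
1992-08-23 ARS 0
1992-08-23 LIV 2
1992-08-24 MNU 0
1992-08-25 LIV 2
1992-08-26 ARS 0
1992-08-26 CHL 0", header = TRUE, stringsAsFactors = FALSE)

# Number of rows (games) per date
games <- tapply(dat$Gls, dat$gamedate, length)

# Number of rows with zero goals per date; summing a logical
# vector counts its TRUE values
blanks <- tapply(dat$Gls == 0, dat$gamedate, sum)

# Both results are grouped by the same key, so they line up by position
res <- data.frame(gamedate = names(games),
                  games    = as.vector(games),
                  blanks   = as.vector(blanks))
print(res)
```

Because both counts come from a single pass over the same grouping variable, dates with no blanks keep a count of 0 instead of being dropped, which is the case a plain merge of two separately filtered tables would need extra care to handle.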