Search code examples
rdata.tablegroupingsummary

Counts of category by group


One variable in my data is a categorical variable. I would like to compute the counts per category by group. My data has the following structure:

dt <- data.table("fact"=c(2,2,1,3,3,2,3,3,3,3,2,2,1,3,3,2,3,3,3,3,2,2,1,3,3,2,3,3,3,3), "month"=rep(1:5, 6), "id"=c(rep("a", 15), rep("b", 15)))

After the operation, the data should look as follows:

dt1 <- data.table("id" = c(rep("a", 5), rep("b", 5)), "month" = rep(1:5,2), 
"fact1" = c(0,0,2,0,0,0,0,1,0,0),
"fact2" = c(3,2,0,0,0,3,1,0,0,0),
"fact3" = c(0,1,1,3,3,0,2,2,3,3))

Note that fact1, fact2, fact3 are the levels in the categorical variable fact.


Solution

  • I created a new character-vector column by pasting "fact" with the value in the fact column, and casting to wide format:

    library(data.table)
    
    dt <- data.table("fact"=c(2,2,1,3,3,2,3,3,3,3,2,2,1,3,3,2,3,3,3,3,2,2,1,3,3,2,3,3,3,3), "month"=rep(1:5, 6), "id"=c(rep("a", 15), rep("b", 15)))
    
    dt[, name := paste("fact", fact, sep="")]
    dcast(dt, id+month~name, value.var="name")
    

    yields

        id month fact1 fact2 fact3
     1:  a     1     0     3     0
     2:  a     2     0     2     1
     3:  a     3     2     0     1
     4:  a     4     0     0     3
     5:  a     5     0     0     3
     6:  b     1     0     3     0
     7:  b     2     0     1     2
     8:  b     3     1     0     2
     9:  b     4     0     0     3
    10:  b     5     0     0     3