R summing factors


I split some data according to a factor, like this:

 a <- factor(data$fact)
 b <- split(data,a)

Now I would like to add some of the factor groups together, e.g.

tot <- b$A+b$B

but I'm getting the following error:

"sum" not meaningful for factors

Any help would be great.

aaa val1 val2 ...
aaa
bbb
bbb
ccc
ccc

If I split by this factor I get three groups. But I want, for example, aaa and ccc to be treated as one group. This means that the values in the other columns should be summed over both.

Thanks


Solution

  • Create a new factor variable before splitting:

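    # Make up some data. Cases must be a factor for levels() to work below;
    # since R 4.0, data.frame() no longer converts strings to factors.
    df = data.frame(Cases = factor(sample(LETTERS[1:5], 10, replace=TRUE)),
                    Set1 = 1:10, Set2 = 11:20)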
    # Duplicate your cases column
    df$Cases_2 = df$Cases
    # Merge levels A and B into a single level, AB
    levels(df$Cases_2) <- ifelse(levels(df$Cases_2) %in% c("A","B"), 
                                 "AB", levels(df$Cases_2))
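    # Drop both grouping columns (1 and 4) and split by the merged factor
    temp = split(df[-c(1, 4)], df$Cases_2)
    temp
    # (output from one random draw of sample(); yours will differ)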
    # $AB
    #   Set1 Set2
    # 3    3   13
    # 5    5   15
    # 6    6   16
    # 8    8   18
    # 
    # $C
    #   Set1 Set2
    # 4    4   14
    # 9    9   19
    # 
    # $D
    #    Set1 Set2
    # 2     2   12
    # 7     7   17
    # 10   10   20
    # 
    # $E
    #   Set1 Set2
    # 1    1   11
    

    Then use lapply to apply colSums to each group:

    lapply(temp, colSums)
    # $AB
    # Set1 Set2 
    #   22   62 
    # 
    # $C
    # Set1 Set2 
    #   13   33 
    # 
    # $D
    # Set1 Set2 
    #   19   49 
    # 
    # $E
    # Set1 Set2 
    #    1   11
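
The same idea can probably be written more compactly by merging the levels and then summing with aggregate(). This is only a sketch on made-up, fixed data (so the result is reproducible); the column names Cases, Set1, Set2 and the helper column Group are assumptions, and here aaa/ccc from the question are stood in for by A and C.

```r
# Hypothetical fixed data, so the totals below are reproducible
df <- data.frame(Cases = factor(c("A", "B", "C", "A", "C", "B")),
                 Set1 = 1:6, Set2 = 11:16)

# Copy the factor and give A and C the same level name;
# assigning duplicate level names merges those levels
df$Group <- df$Cases
levels(df$Group)[levels(df$Group) %in% c("A", "C")] <- "AC"

# Sum every value column within each merged group
totals <- aggregate(cbind(Set1, Set2) ~ Group, data = df, FUN = sum)
totals
#   Group Set1 Set2
# 1    AC   13   53
# 2     B    8   28
```

This returns one data frame with a row per group rather than a list, which may be easier to work with if the totals are needed for further calculations.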