Obtain a row sum based on a condition in R


I am scoring the PES-brief scale at work for a study. One of the scales requires a frequency-of-events count: each item the participant scored 1-3 adds 1, and each item scored 0 adds 0. I need to obtain this score for each person.

EDIT: There are additional columns that I do NOT want to include in the sum. For example, I don't want to count 'dontadd'.

Here is my dataframe sample:

pesb0101 <- c(1,2,3,0,1,0,3,2,1,0)
pesb0102 <- c(1,1,0,0,3,2,3,2,1,0)
pesb0103 <- c(1,2,3,2,1,0,1,0,1,1)
# extra column that must NOT be counted (values as in the printout below)
dontadd <- c(1,2,3,4,5,3,9,2,1,2)
df <- data.frame(pesb0101,pesb0102,pesb0103,dontadd)
rownames(df) <- c('person1','person2','person3','person4','person5','person6','person7','person8','person9','person10')

df
         pesb0101 pesb0102 pesb0103 dontadd
person1         1        1        1       1
person2         2        1        2       2
person3         3        0        3       3
person4         0        0        2       4
person5         1        3        1       5
person6         0        2        0       3
person7         3        3        1       9
person8         2        2        0       2
person9         1        1        1       1
person10        0        0        1       2

I need a score column that counts, for each person, how many of the pesb items are NOT 0 (each non-zero item adds 1). So my dataframe should be:

> df
         pesb0101 pesb0102 pesb0103 dontadd pesbScore
person1         1        1        1       1         3
person2         2        1        2       2         3
person3         3        0        3       3         2
person4         0        0        2       4         1
person5         1        3        1       5         3
person6         0        2        0       3         1
person7         3        3        1       9         3
person8         2        2        0       2         2
person9         1        1        1       1         3
person10        0        0        1       2         1

I've tried a few different methods (rowSums mostly) and I think I'm probably missing something simple.


Solution

  • apply() can run a function on each row of a data frame. If you write a simple function that scores a row the way you want, apply() can do the rest:

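    # count how many values in a row are not 0
    score_counter <- function(row) {
      sum(row != 0)
    }

    # first make a new data frame with just the columns you want to add;
    # "^pesb[0-9]" matches pesb0101 etc. but not dontadd, and not pesbScore
    # if this code is run a second time
    df_pesb <- df[, grepl("^pesb[0-9]", names(df))]
    # use the new data frame to count a score for each row
    df$pesbScore <- apply(df_pesb, 1, score_counter)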
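
Since the question mentions trying rowSums, here is a minimal sketch of how the same count could be done with it, without a helper function. The idea is that comparing the item columns to 0 gives a logical matrix, and rowSums then counts the TRUE values in each row. The data is rebuilt from the sample in the question. The name item_cols and the "^pesb[0-9]" pattern are assumptions about how the item columns are named, so adjust them to your real data.

```r
# Rebuild the sample data from the question (values copied from the post)
df <- data.frame(
  pesb0101 = c(1, 2, 3, 0, 1, 0, 3, 2, 1, 0),
  pesb0102 = c(1, 1, 0, 0, 3, 2, 3, 2, 1, 0),
  pesb0103 = c(1, 2, 3, 2, 1, 0, 1, 0, 1, 1),
  dontadd  = c(1, 2, 3, 4, 5, 3, 9, 2, 1, 2),
  row.names = paste0("person", 1:10)
)

# Pick only the item columns. The pattern is an assumption about the naming
# scheme: it matches pesb0101 etc. and skips both dontadd and pesbScore.
item_cols <- grep("^pesb[0-9]", names(df), value = TRUE)

# df[item_cols] != 0 is a logical matrix; rowSums counts the TRUEs per row
df$pesbScore <- rowSums(df[item_cols] != 0)

df
```

If the real data has missing values, rowSums(..., na.rm = TRUE) would skip them rather than return NA for that person. Whether a missing item should count as 0 is a scoring decision for the scale, not something the code can settle.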