Tags: r, dataframe, dplyr, tidyverse, rowsum

Simple way to pipe rowsums


I have the following data frame (only a head sample is shown):

dput(sample)
structure(list(
  VR1 = c(NA, NA, 1L, NA, 0L, NA),
  VR2 = c(NA, NA, NA, NA, NA, NA),
  VR3 = c(NA, NA, 0L, NA, 0L, NA),
  VR4 = c(NA, NA, 1L, NA, 0L, NA),
  VR5 = c(NA, NA, 1L, NA, 1L, NA),
  VR6 = c(NA, NA, 0L, NA, 0L, NA),
  VR7 = c(NA, NA, 1L, NA, 0L, NA),
  VR8 = c(NA, NA, 0L, NA, 0L, NA),
  VR9 = c(NA, NA, 1L, NA, 1L, NA),
  VR10 = c(NA, NA, 1L, NA, 0L, NA),
  VR11 = c(NA, NA, 0L, NA, 0L, NA),
  VR12 = c(NA, NA, 0L, NA, 0L, NA),
  VR13 = c(NA, NA, 1L, NA, 0L, NA),
  VR14 = c(NA, NA, 1L, NA, 0L, NA),
  VR15 = c(NA, NA, 1L, NA, 1L, NA),
  VR16 = c(NA, NA, 0L, NA, 0L, NA),
  VR17 = c(NA, NA, 1L, NA, 0L, NA),
  VR18 = c(NA, NA, 1L, NA, 1L, NA),
  VR19 = c(NA, NA, 1L, NA, 0L, NA),
  VR20 = c(NA, NA, 1L, NA, 0L, NA)
), row.names = c(NA, 6L), class = "data.frame")

I do a lot of manipulation beforehand (such as deleting columns), but I cannot find a function to pipe simple row sums into a new column. Here is what I have been trying:

sample <- sample %>% mutate(total = rowSums(1:20))

Online I keep finding suggestions to use sum(c_across(...)), but R does not recognize it, even though I have loaded tidyverse and dplyr.


Solution

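  • One dplyr option is c_across() combined with rowwise(); a row id is added here so that each row can be identified. Note that c_across() was introduced in dplyr 1.0.0, so an older installed version could explain why R does not recognize it: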

    library(dplyr)

    # Add a row id, switch to row-wise mode, then sum VR1..VR20 within each row
    sample %>%
      mutate(id = 1:n()) %>%
      rowwise(id) %>%
      mutate(total = sum(c_across(VR1:VR20), na.rm = TRUE))
    

    Output:

    # A tibble: 6 x 22
    # Rowwise:  id
        VR1 VR2     VR3   VR4   VR5   VR6   VR7   VR8   VR9  VR10  VR11  VR12  VR13  VR14  VR15  VR16
      <int> <lgl> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
    1    NA NA       NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
    2    NA NA       NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
    3     1 NA        0     1     1     0     1     0     1     1     0     0     1     1     1     0
    4    NA NA       NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
    5     0 NA        0     0     1     0     0     0     1     0     0     0     0     0     1     0
    6    NA NA       NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
    # ... with 6 more variables: VR17 <int>, VR18 <int>, VR19 <int>, VR20 <int>, id <int>, total <int>
    

    The data used was the dput(sample) you shared.
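
As a possible alternative, base rowSums() can be piped directly, since it is already vectorized over rows. The attempt in the question fails because rowSums(1:20) receives a plain integer vector rather than the columns themselves; rowSums() needs a two-dimensional object. The sketch below is not the code from the answer above. It assumes dplyr 1.0.0 or later for across(), and it uses a small hypothetical data frame `df` with three columns as a stand-in for the real data:

```r
library(dplyr)

# Hypothetical three-column stand-in for the question's data
df <- data.frame(
  VR1 = c(NA, 1L, 0L),
  VR2 = c(NA, NA, NA),
  VR3 = c(NA, 1L, 1L)
)

# across() with no function returns the selected columns as a data frame,
# which rowSums() can sum row by row; no rowwise() or id column is involved
result <- df %>%
  mutate(total = rowSums(across(VR1:VR3), na.rm = TRUE))

# Fallback that avoids across(): the magrittr dot refers to the piped data
result_old <- df %>%
  mutate(total = rowSums(select(., VR1:VR3), na.rm = TRUE))

print(result)
```

For the data in the question, the column range would be VR1:VR20. Two differences from the c_across() version are worth noting: rowSums() returns a double column rather than an integer one, and the result is an ordinary (not row-wise) data frame. In both approaches, rows that are entirely NA get a total of 0 when na.rm = TRUE.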