Tags: r, dplyr, tidyverse, magrittr

How to apply a function on a selection of columns in a pipe sequence in R?


I have a data frame (or a tibble, it doesn't matter) with many columns, and I want to apply a function (say rowSums) to only 7 of them without getting rid of the others. The catch is that I want to do it inside a pipe sequence: create (or read) the data, apply the function, then optionally run further operations.

Here is a reproducible example with a data frame where I would like to apply rowSums to the first 3 columns:

data <- data.frame("v1" = runif(10, 0, 10), "v2" = runif(10, 0 ,10), "v3" = runif(10, 0 ,10), "v4" = rep("some_charchter", 10))

The way I would usually do it is:

data$sum <- rowSums(data[,1:3])

But I want something like this:

data <- data.frame("v1" = runif(10, 0, 10), "v2" = runif(10, 0 ,10), "v3" = runif(10, 0 ,10), "v4" = rep("some_charchter", 10)) %>% 
  mutate(sum = rowSums())

Thanks for your help!


Solution

  • Inside a magrittr pipe (%>%), the dot (.) refers to the object being piped in, so you can pass it to rowSums directly. mutate(sum = rowSums(.[, 1:3])) therefore does the trick:

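    library(dplyr)  # provides mutate() and re-exports the %>% pipe

    data <- data.frame("v1" = runif(10, 0, 10), "v2" = runif(10, 0, 10), "v3" = runif(10, 0, 10), "v4" = rep("some_charchter", 10)) %>% 
      # . is the piped-in data frame; .[, 1:3] selects its first three columns
      mutate(sum = rowSums(.[, 1:3]))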
    
    data
             v1        v2        v3             v4       sum
    1  2.280871 0.1981815 7.5349128 some_charchter 10.013965
    2  1.250208 7.6687056 0.6193483 some_charchter  9.538262
    3  6.782954 3.6973201 2.7694021 some_charchter 13.249677
    4  3.809574 6.8641731 3.1271489 some_charchter 13.800896
    5  9.339726 4.4571677 5.4489081 some_charchter 19.245802
    6  6.623371 3.9594287 0.6025072 some_charchter 11.185307
    7  6.843193 1.3548732 3.1826649 some_charchter 11.380731
    8  2.377099 7.5661778 9.6320561 some_charchter 19.575333
    9  3.582874 2.1485691 8.2970807 some_charchter 14.028524
    10 4.565336 3.7073800 0.3355328 some_charchter  8.608248
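
As a possible alternative to the dot, here is a minimal sketch that selects the columns by name with across(). It assumes dplyr 1.0.0 or later (where across() was introduced). The set.seed() call is my addition, only there to make the random data repeatable, and the final check against v1 + v2 + v3 is just a sanity test, not part of the original answer.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the random data is repeatable

data <- data.frame(
  v1 = runif(10, 0, 10),
  v2 = runif(10, 0, 10),
  v3 = runif(10, 0, 10),
  v4 = rep("some_charchter", 10)
) %>%
  # across(v1:v3) returns the selected columns as a data frame,
  # which rowSums() then sums row by row
  mutate(sum = rowSums(across(v1:v3)))

# Sanity check: the new column should match adding the columns by hand
all.equal(data$sum, data$v1 + data$v2 + data$v3)
```

Selecting by name (v1:v3) rather than by position (1:3) should keep working if columns are reordered or new ones are added in front. It also should not depend on the magrittr dot, which may matter if you switch to the native |> pipe.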