What code should I use if I want to take all the columns in Row 2 and substract them by the values in Row 1. I'd like to do this down the entire dataset. That is, row 4 - row 3, row 6 - row 5, and so on.
Data
> mtcars[1:3, ]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Using base R
as.data.frame(lapply(mtcars[1:3, ], function(x) c(NA, diff(x))))
mpg cyl disp hp drat wt qsec vs am gear carb
1 NA NA NA NA NA NA NA NA NA NA NA
2 0.0 0 0 0 0.00 0.255 0.56 0 0 0 0
3 1.8 -2 -52 -17 -0.05 -0.555 1.59 1 0 0 -3
With dplyr:
library(dplyr)
Option 1
> mutate(mtcars[1:3, ], across(everything(), ~. - lag(.)))
mpg cyl disp hp drat wt qsec vs am gear carb
1 NA NA NA NA NA NA NA NA NA NA NA
2 0.0 0 0 0 0.00 0.255 0.56 0 0 0 0
3 1.8 -2 -52 -17 -0.05 -0.555 1.59 1 0 0 -3
Option 2
> mutate_all(mtcars[1:3, ], ~. - lag(.))
mpg cyl disp hp drat wt qsec vs am gear carb
1 NA NA NA NA NA NA NA NA NA NA NA
2 0.0 0 0 0 0.00 0.255 0.56 0 0 0 0
3 1.8 -2 -52 -17 -0.05 -0.555 1.59 1 0 0 -3
In this code, ~. - lag(.)
is an anonymous function just like function(x) x - lag(x)
.
In case your data contains non numeric variables, to do this without errors you can use mutate_if(your_data, is.numeric, ~. - lag(.))
or a variant with the new approach across()
.