I am working in R and have a data frame with various series. I need to apply the following two operations to most of these columns:
I tried this solution for one column:
df %>% mutate( pmax(0,x - pmax(lag(x), lag(x,2), lag(x,3), lag(x,4))) )
but I guess it’s possible to do this for all columns and for both operations using dplyr’s across and purrr syntax. Any ideas on how to do this?
You could use the across() function from the dplyr package.
library(dplyr)

# Define some test data
df <- data.frame(x = round(runif(100, 10, 15)), y = round(rnorm(100, 10, 5), 1))

# Define the function to apply to each column
# (lag() here is dplyr::lag, so dplyr must be loaded)
mypmax <- function(i) {
  pmax(0, i - pmax(lag(i), lag(i, 2), lag(i, 3), lag(i, 4)))
}

# Apply the function to columns 1 and 2,
# storing the results in new columns prefixed with "new_"
df %>% mutate(across(c(1, 2), mypmax, .names = "new_{.col}"))
    x    y new_x new_y
1  12  7.3    NA    NA
2  14 10.9    NA    NA
3  10 17.8    NA    NA
4  14 12.5    NA    NA
5  15 10.0     1   0.0
6  14 11.6     0   0.0
7  10  7.9     0   0.0
8  12  8.6     0   0.0
9  11 11.3     0   0.0
10 11  4.7     0   0.0
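
Since the question asks for two operations at once, here is a minimal sketch of passing a named list of functions to across(). The second operation is not shown in the question, so shortfall_under_min below is a hypothetical stand-in (the mirror image of the first function using pmin), and the names excess_over_max, up and down are illustrative only. The small fixed data frame is made up so the result is reproducible.

```r
library(dplyr)

# Small fixed data set (illustrative values, not the random data above)
df <- data.frame(x = c(12, 14, 10, 14, 15, 14),
                 y = c(7.3, 10.9, 17.8, 12.5, 10.0, 11.6))

# First operation: amount by which a value exceeds the maximum
# of the previous four values, floored at 0
excess_over_max <- function(i) {
  pmax(0, i - pmax(lag(i), lag(i, 2), lag(i, 3), lag(i, 4)))
}

# HYPOTHETICAL second operation (placeholder, not from the question):
# amount by which a value falls below the minimum of the previous four values
shortfall_under_min <- function(i) {
  pmax(0, pmin(lag(i), lag(i, 2), lag(i, 3), lag(i, 4)) - i)
}

# A named list applies every function to every selected column;
# {.col} and {.fn} in .names build the new column names
res <- df %>%
  mutate(across(c(x, y),
                list(up = excess_over_max, down = shortfall_under_min),
                .names = "{.col}_{.fn}"))

print(res)
```

This should add the columns x_up, x_down, y_up and y_down, with NA in the first four rows because fewer than four previous values exist there. Swap the placeholder for your real second operation.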