Search code examples
rfunctionmutateacross

in R, add the value of a column (rowwise) to any column that is named as a date (and output the result as a date)


I am attempting to write what should be simple code to add the value of a data frame column to a date in another column using across() to find all columns with the suffix "_date".

example input (df):

  id xdays curation_date event_date
1  1    -4    2024-09-01 2024-07-04
2  2    10    2024-07-24 2024-06-13
3  3     7    2024-01-03 2023-12-03

expected output (df1)

  id xdays curation_date event_date manual_shift_curation_date manual_shift_event_date
1  1    -4    2024-09-01 2024-07-04                 2024-08-28              2024-06-30
2  2    10    2024-07-24 2024-06-13                 2024-08-03              2024-06-23
3  3     7    2024-01-03 2023-12-03                 2024-01-10              2023-12-10

Below is what I've tried. df is the sample dataframe. df1 is the manual calculation to show expected output.

I've spent a few hours trying things from stackoverflow but haven't found the right magical solution. I'm sure I'm doing something wrong, but can't figure out where.

#requirement - #add the value of df$xdays to any column that is named as a date (and output the result as a date)
library(tidyverse)

################functions attempted
# 1) include a parameter for the column with the date shift
dateshift <- function(x,shiftcolumn) {as.Date(as.Date(x) + {{shiftcolumn}})} #shiftcolumn is actually a constant, but this is closest to working.
# or
# 2) just call the dateshift column within the function
dateshift_nocol <- function(x) {as.Date(as.Date(x) + {{xdays}})}  

#example dataframe
df <- data.frame(id=c(1,2,3), 
                 xdays=c(-4, 10, 7),
                 curation_date=c('2024-09-01','2024-07-24','2024-01-03'),
                 event_date=c('2024-07-04','2024-06-13','2023-12-03')
)

###############manaully shift date to show expected results
df1 <- df %>% 
  mutate(manual_shift_curation_date=as.Date(as.Date(curation_date)+xdays),
         manual_shift_event_date  =as.Date(as.Date(event_date)+xdays))
#works

###############using dateshift passing the constant column as a parameter
df2 <- df %>%
  mutate(across(matches("date"), dateshift(.,shiftcolumn=xdays), .names = "shifted_{.col}")) #renamed for qc check
#error: Caused by error in `as.Date.default()`:! do not know how to convert 'x' to class “Date”

df3 <- df %>% 
  mutate(across(matches("date"), ~ apply(., 1, dateshift))) 
#error: Caused by error in `apply()`:! dim(X) must have a positive length

###################using dateshift_nocol (without passing the constant column value)
df4 <- df %>%
  mutate(across(matches("date"), dateshift_nocol, .names = "shifted_{.col}")) #renamed for qc check
# Caused by error in `across()`:! Can't compute column `shifted_curation_date`. Caused by error: ! object 'xdays' not found

df5 <- df %>% 
  mutate(across(matches("date"), ~ apply(., 1, dateshift_nocol)))
#error: Caused by error in `apply()`:! dim(X) must have a positive length



All non-manual attempts fail. See code examples and error outputs


Solution

  • Using across and .names for the variable names

    library(dplyr)
    
    df %>%
      mutate(across(ends_with("date"), ~ .x + xdays, .names="manual_shift_{.col}"))
      id xdays curation_date event_date manual_shift_curation_date
    1  1    -4    2024-09-01 2024-07-04                 2024-08-28
    2  2    10    2024-07-24 2024-06-13                 2024-08-03
    3  3     7    2024-01-03 2023-12-03                 2024-01-10
      manual_shift_event_date
    1              2024-06-30
    2              2024-06-23
    3              2023-12-10
    
    Data
    df <- structure(list(id = 1:3, xdays = c(-4L, 10L, 7L), curation_date = 
    structure(c(19967, 19928, 19725), class = "Date"), event_date =
    structure(c(19908, 19887, 19694), class = "Date")), row.names = 
    c("1", "2", "3"), class = "data.frame")