I have a number of days and a reference date, and I want to compute the resulting date using SparkR. Here is some toy data and code:
library(magrittr)
library(SparkR)
df <- tibble::tribble(
~days, ~date,
17000L, "1970-01-01",
17200L, "1970-01-01")
df_spark <- SparkR::as.DataFrame(df)
This works:
df_spark <- df_spark %>%
SparkR::mutate(date2 = date_add(to_date(df_spark$date), 17000))
But this doesn't:
df_spark <- df_spark %>%
SparkR::mutate(date2 = date_add(to_date(df_spark$date), df_spark$days))
It throws an error:
unable to find an inherited method for function ‘date_add’ for signature ‘"Column", "Column"’
I want to pass the column "days" as the second argument to date_add instead of a fixed number, because "days" takes many different values. How can I do that? If it is not possible with date_add, what is the alternative in SparkR?
Instead of using date_add directly, you should use expr:
df_spark <- df_spark %>%
SparkR::mutate(date2 = expr("date_add(to_date(date), days)"))
df_spark %>% head()
   days       date      date2
1 17000 1970-01-01 2016-07-18
2 17200 1970-01-01 2017-02-03
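The same idea (letting Spark SQL parse the expression so that days is resolved as a column) can probably also be written with selectExpr. The following is only a minimal sketch, not a tested recipe for every Spark version: it assumes a local Spark installation, the master = "local[1]" setting and the names local_df, result, collected and expected are made up for illustration, and it cross-checks the Spark result against plain base R date arithmetic.

```r
library(SparkR)

# Assumption: a local Spark installation is available; "local[1]" is just for this sketch.
sparkR.session(master = "local[1]")

# Same toy data as in the question, built with base R instead of tibble
local_df <- data.frame(
  days = c(17000L, 17200L),
  date = c("1970-01-01", "1970-01-01"),
  stringsAsFactors = FALSE
)
df_spark <- as.DataFrame(local_df)

# selectExpr takes SQL expression strings, so `days` refers to the column,
# not to an R value; "*" keeps the existing columns
result <- selectExpr(df_spark, "*", "date_add(to_date(date), days) AS date2")

# Bring the result back to R as a regular data.frame
collected <- collect(result)

# Cross-check against base R: adding an integer to a Date adds that many days
expected <- as.Date(local_df$date) + local_df$days
print(all(collected$date2 == expected))
```

If the final check prints TRUE, the Spark-side calculation agrees with base R for these rows. Whether expr inside mutate or selectExpr reads better is mostly a matter of taste; both hand the string to the Spark SQL parser.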