I want to create a column in PySpark that contains the date 3 years prior to the date in a given column. The date column looks like this:
date
2018-08-01
2016-08-11
2014-09-18
2018-12-08
2011-12-18
And I want this result:
date past_date
2018-08-01 2015-08-01
2016-08-11 2013-08-11
2014-09-18 2011-09-18
2018-12-08 2015-12-08
2011-12-18 2008-12-18
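You can use the date_sub function. Here is Scala code, which will be very similar in Python:
df.withColumn("past_date", date_sub(col("date"), 1095))
Note that date_sub subtracts a fixed number of days, so 1095 days lands one day late whenever the 3-year span contains a February 29 (for example, 2018-08-01 becomes 2015-08-02, not 2015-08-01). To get the same calendar day 3 years earlier, use add_months with a negative offset instead:
df.withColumn("past_date", add_months(col("date"), -36))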
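The difference between subtracting 1095 days and going back 3 calendar years can be checked in plain Python, without a Spark session. This is only a sketch of the date arithmetic: `years_before` is a hypothetical helper written for this illustration, not a Spark function, and its Feb 29 fallback is an assumption chosen to clamp to the end of February.

```python
from datetime import date, timedelta

def years_before(d: date, years: int = 3) -> date:
    """Same calendar day `years` earlier (hypothetical helper, not a Spark API)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        # Feb 29 with a non-leap target year: fall back to Feb 28 (assumption)
        return d.replace(year=d.year - years, day=28)

# The sample dates from the question
dates = [date(2018, 8, 1), date(2016, 8, 11), date(2014, 9, 18),
         date(2018, 12, 8), date(2011, 12, 18)]

for d in dates:
    by_days = d - timedelta(days=1095)  # fixed offset of 1095 days
    by_years = years_before(d)          # calendar-aware 3-year offset
    print(d, by_days, by_years)
```

For the sample data, the two columns agree only for 2011-12-18; the other four rows come out one day late with the fixed 1095-day offset, because each of those 3-year spans includes a leap day.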