python pyspark apache-spark-sql window-functions

Get "circular lag" of a column

I would like to create a new column in a pyspark.sql.DataFrame based on lagged values of an existing column, but with wrap-around: the last values should become the first ones, and the first values should become the last ones. Here is an example:

df = spark.createDataFrame([(1,100),
                            (2,200),
                            (3,300),
                            (4,400),
                            (5,500)],
                            ['id','value'])

df.show()

+---+-----+
| id|value|
+---+-----+
|  1|  100|
|  2|  200|
|  3|  300|
|  4|  400|
|  5|  500|
+---+-----+

The desired output would be:

+---+-----+----------------+-----------------+
| id|value|lag_value_plus_2|lag_value_minus_2|
+---+-----+----------------+-----------------+
|  1|  100|             300|              400|
|  2|  200|             400|              500|
|  3|  300|             500|              100|
|  4|  400|             100|              200|
|  5|  500|             200|              300|
+---+-----+----------------+-----------------+
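To pin down the intended semantics without Spark, here is a minimal plain-Python sketch. The helper name `circular_shift` is hypothetical (it is not a PySpark function); it only restates the table above using modular indexing, assuming the rows are ordered by id.

```python
def circular_shift(values, offset):
    """Return a list whose i-th element is values[(i + offset) % n].

    A positive offset looks ahead (like lead), a negative one looks back
    (like lag); indices that run off either end wrap around.
    """
    n = len(values)
    return [values[(i + offset) % n] for i in range(n)]


values = [100, 200, 300, 400, 500]            # the "value" column, ordered by id
lag_value_plus_2 = circular_shift(values, 2)   # matches column lag_value_plus_2
lag_value_minus_2 = circular_shift(values, -2) # matches column lag_value_minus_2
```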

I suspect it has something to do with window functions or the pyspark.sql.functions.lag function, but I can't figure out how to do it.

Solution

Here is one solution I can offer, though I'm not sure it is the most efficient one:

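from functools import reduce

from pyspark.sql import Window
from pyspark.sql import functions as F

# Stack three tagged copies of the DataFrame: x = -1 ("before"),
# x = 0 (the original rows) and x = 1 ("after").
df_tripled = reduce(
    lambda a, b: a.union(b),
    [df.withColumn("x", F.lit(i)) for i in [-1, 0, 1]]
)

# Ordering by (x, id) places a full copy on each side of the original rows,
# so lead/lag wrap around. With no partitioning, Spark moves all rows to a
# single partition for this window.
w = Window.orderBy("x", "id")

df_tripled.withColumn(
    "lag_value_plus_2",
    F.lead("value", 2).over(w)
).withColumn(
    "lag_value_minus_2",
    F.lag("value", 2).over(w)
).where("x = 0").drop("x").show()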

+---+-----+----------------+-----------------+
| id|value|lag_value_plus_2|lag_value_minus_2|
+---+-----+----------------+-----------------+
|  1|  100|             300|              400|
|  2|  200|             400|              500|
|  3|  300|             500|              100|
|  4|  400|             100|              200|
|  5|  500|             200|              300|
+---+-----+----------------+-----------------+
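Why the three copies are enough can be checked with a small plain-Python simulation of the same idea. This is only a sketch, not Spark code: the function name `circular_lead_lag_by_tripling` is hypothetical, and a list stands in for the DataFrame ordered by ("x", "id"). It also makes one limitation visible: with one copy on each side, the offset cannot exceed the number of rows.

```python
def circular_lead_lag_by_tripling(values, offset):
    """Emulate the union-of-three-copies trick on a plain list.

    offset > 0 behaves like lead, offset < 0 like lag.
    """
    n = len(values)
    if abs(offset) > n:
        # One extra copy on each side only covers offsets up to n rows.
        raise ValueError("offset must not exceed the number of rows")
    tripled = values * 3  # copies tagged x = -1, 0, 1, in that order
    # The middle copy (x = 0) occupies positions n .. 2n - 1; reading at a
    # fixed offset from there never falls outside the tripled list.
    return [tripled[n + i + offset] for i in range(n)]


values = [100, 200, 300, 400, 500]
plus_2 = circular_lead_lag_by_tripling(values, 2)
minus_2 = circular_lead_lag_by_tripling(values, -2)
```

For larger offsets, one option would be to reduce the offset modulo the row count before applying it.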