If i define a window object like this :
w_score = Window().partitionBy("HUB_ID").orderBy(F.col("coefficient").desc(), F.col("LastUpdateDate").desc())
And another one, like this :
w_score_prim = w_score.orderBy(F.col("Primary"))
The new object, w_score_prim
is equivalent to which one ? :
Window().partitionBy("HUB_ID").orderBy(F.col("coefficient").desc(), F.col("LastUpdateDate").desc(), F.col("Primary"))
or
Window().partitionBy("HUB_ID").orderBy(F.col("Primary"))
In multiple orderBy()
call, last orderBy
overwrites all previously called orderBy
. So to answer your question - it would be
Window().partitionBy("HUB_ID").orderBy(F.col("Primary"))
It can be better illustrated using below 3 use cases:
a
(ascending) and then b
(ascending) try this example.orderBy()
twice (or multiple times) then it considers the last call as the final ordering call. So in this example it orders by column b
'only'.b
and has ignored column a
ordering.import pyspark.sql.functions as f
from pyspark.sql.window import Window
df = sc.parallelize([
[1, 3, 1, 10],
[1, 0, 2, 20],
[3, 2, 3, 30],
[2, 1, 6, 40],
[2, 1, 4, 50],
[2, 2, 5, 60]
]).toDF(('id', 'a', 'b', 'val'))
#case 1
w = Window.partitionBy().partitionBy("id").orderBy(f.col("a"), f.col("b"))
df1 = df.withColumn("c", f.sum(f.col("val")).over(w))
df1.show()
#case 2
w_2 = Window.partitionBy().partitionBy("id").orderBy(f.col("a")).orderBy(f.col("b"))
df2 = df.withColumn("c", f.sum(f.col("val")).over(w_2))
df2.show()
#case 3
w_3 = Window.partitionBy().partitionBy("id").orderBy(f.col("a"))
w_3 = w_3.orderBy("b")
df3 = df.withColumn("c", f.sum(f.col("val")).over(w_3))
df3.show()
Output is:
+---+---+---+---+---+
| id| a| b|val| c|
+---+---+---+---+---+
| 1| 0| 2| 20| 20|
| 1| 3| 1| 10| 30|
| 3| 2| 3| 30| 30|
| 2| 1| 4| 50| 50|
| 2| 1| 6| 40| 90|
| 2| 2| 5| 60|150|
+---+---+---+---+---+
+---+---+---+---+---+
| id| a| b|val| c|
+---+---+---+---+---+
| 1| 3| 1| 10| 10|
| 1| 0| 2| 20| 30|
| 3| 2| 3| 30| 30|
| 2| 1| 4| 50| 50|
| 2| 2| 5| 60|110|
| 2| 1| 6| 40|150|
+---+---+---+---+---+
+---+---+---+---+---+
| id| a| b|val| c|
+---+---+---+---+---+
| 1| 3| 1| 10| 10|
| 1| 0| 2| 20| 30|
| 3| 2| 3| 30| 30|
| 2| 1| 4| 50| 50|
| 2| 2| 5| 60|110|
| 2| 1| 6| 40|150|
+---+---+---+---+---+