I'd like to perform a groupBy() operation with a specific agg():
df = df.groupBy("x", "y").agg(F.max("a").alias("a"), F.max("b").alias("b"))
But is there a way to do the aggregation using a list of columns? I don't want to hardcode them.
You can use a list comprehension to build the aggregate expressions and unpack the list into agg() with *.
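from pyspark.sql import functions as F
list_of_cols = ["a", "b"]
# One max() expression per column, unpacked into agg(); c avoids clashing with the "x" column name
df = df.groupBy("x", "y").agg(*[F.max(c).alias(c) for c in list_of_cols])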