Tags: apache-spark, pyspark, apache-spark-sql

Pivot with custom column names in pyspark


I need to pivot a table with custom column names. Please see the table format below.

+-------+------+------+--------+
|balance| ex_cy|rp_prd|scenario|
+-------+------+------+--------+
|    100|2022Q2|202204|     abc|
|    200|2022Q2|202204|     abc|
|    300|2022Q2|202204|     abc|
+-------+------+------+--------+

I want to convert it into the format below:

+------+------+--------+---+---+---+
| ex_cy|rp_prd|scenario| C1| C2| C3|
+------+------+--------+---+---+---+
|2022Q2|202204|     abc|100|200|300|
+------+------+--------+---+---+---+

Can anyone help with this, please?


Solution

  • Let's create an additional id column to uniquely identify rows per 'ex_cy', 'rp_prd', and 'scenario', then do a groupBy + pivot and aggregate balance with first:

    from pyspark.sql import Window, functions as F
    
    cols = ['ex_cy', 'rp_prd', 'scenario']
    
    # Number the rows within each group to build unique column labels C1, C2, ...
    W = Window.partitionBy(*cols).orderBy('balance')
    id_ = F.concat(F.lit('C'), F.row_number().over(W))
    
    # Pivot on the generated labels; each (group, id) cell holds exactly one row
    result = (
        df
        .withColumn('id', id_)
        .groupBy(*cols).pivot('id').agg(F.first('balance'))
    )
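
    Here, F.row_number orders balances ascending within each group, so C1 holds the smallest balance and C3 the largest; because each pivoted cell contains exactly one row, F.first simply returns that single balance.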
    

    Result

    +------+------+--------+---+---+---+
    | ex_cy|rp_prd|scenario| C1| C2| C3|
    +------+------+--------+---+---+---+
    |2022Q2|202204|     abc|100|200|300|
    +------+------+--------+---+---+---+
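
  • To try this end to end, here is a minimal, self-contained sketch that rebuilds the sample DataFrame from the question (the local SparkSession setup is my assumption, not part of the original post):

    from pyspark.sql import SparkSession
    
    # Local session for testing (assumption; use your existing session in practice)
    spark = SparkSession.builder.getOrCreate()
    
    # Sample data copied from the question's input table
    df = spark.createDataFrame(
        [(100, '2022Q2', '202204', 'abc'),
         (200, '2022Q2', '202204', 'abc'),
         (300, '2022Q2', '202204', 'abc')],
        ['balance', 'ex_cy', 'rp_prd', 'scenario'],
    )

    Running the snippet above on this df produces the pivoted result shown.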