I am trying to pivot a dataframe with one key column and multiple value columns. How do I do this in PySpark? I have used pivot with a single key/value pair before, but I can't figure out this case.
Sample dataframe
id | test_id | test_status | key | score1 | score2 | score3 |
---|---|---|---|---|---|---|
ABC | 1 | complete | q1 | 1 | 2 | 3 |
ABC | 1 | complete | q2 | 4 | 5 | 6 |
ABC | 2 | complete | q1 | 1 | 6 | 7 |
ABC | 2 | complete | q2 | 5 | 6 | 7 |
Expected dataframe
id | test_id | test_status | q1_score1 | q1_score2 | q1_score3 | q2_score1 | q2_score2 | q2_score3 |
---|---|---|---|---|---|---|---|---|
ABC | 1 | complete | 1 | 2 | 3 | 4 | 5 | 6 |
ABC | 2 | complete | 1 | 6 | 7 | 5 | 6 | 7 |
You can pivot with multiple value columns by passing several aggregations to `.agg()`. The pivot value is prefixed to each alias, which produces exactly the `q1_score1` ... `q2_score3` columns you expect.
from pyspark.sql import functions as F

# Pivot on `key`; each alias gets the pivot value prefixed (q1_score1, q1_score2, ...)
df = (df.groupby('id', 'test_id', 'test_status')
        .pivot('key')
        .agg(*[F.first(x).alias(x) for x in ['score1', 'score2', 'score3']]))
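For reference, here is a minimal end-to-end sketch, assuming a local SparkSession and recreating the sample data from the question, so you can verify the resulting column names:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Recreate the sample dataframe from the question
df = spark.createDataFrame(
    [
        ("ABC", 1, "complete", "q1", 1, 2, 3),
        ("ABC", 1, "complete", "q2", 4, 5, 6),
        ("ABC", 2, "complete", "q1", 1, 6, 7),
        ("ABC", 2, "complete", "q2", 5, 6, 7),
    ],
    ["id", "test_id", "test_status", "key", "score1", "score2", "score3"],
)

# Pivot on `key`, aggregating each score column with first()
pivoted = (
    df.groupby("id", "test_id", "test_status")
      .pivot("key")
      .agg(*[F.first(c).alias(c) for c in ["score1", "score2", "score3"]])
)

pivoted.show()
# +---+-------+-----------+---------+---------+---------+---------+---------+---------+
# | id|test_id|test_status|q1_score1|q1_score2|q1_score3|q2_score1|q2_score2|q2_score3|
# +---+-------+-----------+---------+---------+---------+---------+---------+---------+
# |ABC|      1|   complete|        1|        2|        3|        4|        5|        6|
# |ABC|      2|   complete|        1|        6|        7|        5|        6|        7|
# +---+-------+-----------+---------+---------+---------+---------+---------+---------+
# (row order may vary)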