Tags: python, dataframe, pyspark, transpose

How do I transpose a pyspark dataframe?


Parameter  Value  Data_type
window     1024   Data1
noverlap   256    Data1
ylim_min   0      Data1
ylim_max   500    Data1
mag_min    0      Data1
max_max    30     Data1
window     2500   Data2
noverlap   64     Data2
ylim_min   0      Data2
ylim_max   50     Data2
mag_min    0      Data2
mag_max    2500   Data2

How do I transpose this PySpark DataFrame into the following layout?

[image of the desired output]


Solution

  • This works almost the same way as pivoting a pandas DataFrame.

    Assume the DataFrame is named df:

    # One row per Data_type, one column per distinct Parameter value
    pivotdf = df.groupBy("Data_type").pivot("Parameter").sum("Value")
    pivotdf.show()
    

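    Here we group the rows by the Data_type column and pivot on the Parameter column, so each distinct Parameter becomes its own column. sum("Value") fills each cell; because every (Data_type, Parameter) pair occurs only once in this data, the sum is simply that pair's Value. Note that sum requires Value to be a numeric column.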
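
To make the reshaping concrete, here is a minimal plain-Python sketch of what the group/pivot/sum combination computes on the rows from the question. It only illustrates the semantics and is not how Spark executes the query; the helper name pivot_sum is hypothetical and not part of any library.

```python
from collections import defaultdict

# Rows copied from the question, as (Parameter, Value, Data_type)
rows = [
    ("window", 1024, "Data1"),
    ("noverlap", 256, "Data1"),
    ("ylim_min", 0, "Data1"),
    ("ylim_max", 500, "Data1"),
    ("mag_min", 0, "Data1"),
    ("max_max", 30, "Data1"),
    ("window", 2500, "Data2"),
    ("noverlap", 64, "Data2"),
    ("ylim_min", 0, "Data2"),
    ("ylim_max", 50, "Data2"),
    ("mag_min", 0, "Data2"),
    ("mag_max", 2500, "Data2"),
]


def pivot_sum(rows):
    """Hypothetical helper: group by Data_type, turn each Parameter into a
    column, and sum Value within each (Data_type, Parameter) cell."""
    cells = defaultdict(lambda: defaultdict(int))
    for parameter, value, data_type in rows:
        cells[data_type][parameter] += value
    # Every distinct Parameter becomes a column in every output row
    columns = sorted({parameter for parameter, _, _ in rows})
    # Combinations that never occur are None, mirroring a null cell
    return {
        data_type: {column: cell.get(column) for column in columns}
        for data_type, cell in cells.items()
    }


pivoted = pivot_sum(rows)
for data_type, cell in pivoted.items():
    print(data_type, cell)
```

With the data exactly as posted, Data1 has a max_max row where Data2 has mag_max, so the pivot produces both a mag_max and a max_max column, each empty for one of the two groups. If max_max is a typo, correct it in the source data before pivoting.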