
How to use column names with spaces in stack function in pyspark


I am trying to unpivot a DataFrame with the following structure:

fstcol  col 1  col 2
One     1      4
one     2      5
One     3      6

And I want the resulting DataFrame to look like this:

fstcol  col_name  value
One     col 1     1
one     col 1     2
One     col 1     3
One     col 2     4
one     col 2     5
One     col 2     6

I have written the following code to transform it:

df.selectExpr("fstcol","stack(2, 'col 1', col 1, 'col 2', col 2)")

However, I am getting an error because the column names contain spaces: Spark is unable to resolve the column references `col 1` and `col 2`.

Can anyone help me to resolve this?


Solution

  • You must wrap the spaced column names in backticks (Spark SQL's identifier quote character) and alias the output columns of stack:

    df.selectExpr("fstcol", "stack(2, 'col 1', `col 1`, 'col 2', `col 2`) as (col_name, value)")