Tags: scala, apache-spark, analytics

Transpose logic in Spark Scala


I have the below DataFrame in Spark Scala:

--------------------
 a  | b  | c  | d  |
--------------------
 1  | 2  | 3  | 4  |
 5  | 6  | 7  | 8  |
 9  | 10 | 11 | 12 |
 13 | 14 | 15 | 16 |

My code turns every row into a map; the code I tried is:

df.select(map(df.columns.flatMap(c => Seq(lit(c), col(c))): _*).as("map"))

This gives a Map(String -> String) column with 4 records only:

Map(a -> 1, b -> 2, c -> 3, d -> 4)
Map(a -> 5, b -> 6, c -> 7, d -> 8)
Map(a -> 9, b -> 10, c -> 11, d -> 12)
Map(a -> 13, b -> 14, c -> 15, d -> 16)

But I wanted to change like below:

a->1
b->2 
c->3
d->4
a->5
b->6 
c->7
d->8
a->9
b->10 
c->11
d->12
a->13
b->14
c->15
d->16

Any suggestion on how to change or extend the code to get the desired result? I think it needs some transpose logic; I am fairly new to Scala.


Solution

  • Use explode to flatten the map data into one key/value pair per row. Try the code below:

    df.select(map(df.columns.flatMap(c => Seq(lit(c),col(c))):_*).as("map"))
    .select(explode($"map"))
    .show(false)
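
    To see what explode does to the per-row maps, here is a plain-Scala sketch (no Spark needed) of the same flattening, using the sample data from the question:

    ```scala
    // Column names and sample values mirror the question's DataFrame.
    val columns = Seq("a", "b", "c", "d")
    val rows = Seq(
      Seq(1, 2, 3, 4),
      Seq(5, 6, 7, 8),
      Seq(9, 10, 11, 12),
      Seq(13, 14, 15, 16)
    )

    // Pair each column name with the row's value, then flatten all rows —
    // the plain-collections analogue of building a map per row with
    // map(...) and then running explode($"map") over it.
    val exploded: Seq[(String, Int)] = rows.flatMap(r => columns.zip(r))

    exploded.foreach { case (k, v) => println(s"$k -> $v") }
    ```

    This yields 16 key/value pairs in order (a -> 1, b -> 2, ..., d -> 16), matching the desired output.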
    

    Alternatively, without map, use an array of structs:

    val colExpr = array(
        df
        .columns
        .flatMap(c => Seq(struct(lit(c).as("key"),col(c).as("value")).as("map"))):_*
    ).as("map")
    
    df
    .select(colExpr)
    .select(explode($"map").as("map"))
    .select($"map.*").show(false)
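
    The array-of-structs variant can likewise be sketched in plain Scala with a case class standing in for the struct, assuming the same sample data (here only the first two rows, for brevity):

    ```scala
    // Plain-Scala analogue of the array-of-structs variant: each cell
    // becomes a (key, value) struct, the row becomes an array of those
    // structs, and explode + select($"map.*") flattens them into two
    // columns named key and value.
    case class KV(key: String, value: Int)

    val columns = Seq("a", "b", "c", "d")
    val rows = Seq(Seq(1, 2, 3, 4), Seq(5, 6, 7, 8))

    // Build the per-row array of structs, then flatten (explode) it.
    val flattened: Seq[KV] =
      rows.flatMap(r => columns.zip(r).map { case (k, v) => KV(k, v) })

    flattened.foreach(kv => println(s"${kv.key}\t${kv.value}"))
    ```

    The advantage over the map version is that struct values can keep their original types per column, whereas a map forces all values to a common type.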