
Scalding: convert one row into multiple rows


I have a Scalding pipe whose entries have the form (String, Map[String, Int]). I need to convert each such row into multiple rows, one per map entry. For example, given

("Type A", Map("a1" -> 2, "a2" -> 2, "a3" -> 3))

I need three output rows:

("Type A", "a1", 2)

("Type A", "a2", 2)

("Type A", "a3", 3)

It's essentially the inverse of the groupBy operation, I guess. Does anyone know of a way to do this?


Solution

  • You can use flatMap, like so:

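    import com.twitter.scalding._

    case class Input(kind: String, map: Map[String, Int])

    class TestJob(args: Args) extends Job(args) {
      // Placeholder input so the job compiles; replace with your real source.
      val inputPipe: TypedPipe[Input] =
        TypedPipe.from(List(Input("Type A", Map("a1" -> 2, "a2" -> 2, "a3" -> 3))))

      // flatMap emits one output row per map entry, tagged with the record's kind.
      val out: TypedPipe[(String, String, Int)] = inputPipe.flatMap { rec =>
        rec.map.map { case (key, count) => (rec.kind, key, count) }
      }

      // A job needs at least one sink; the "output" argument name is an assumption.
      out.write(TypedTsv[(String, String, Int)](args("output")))
    }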
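
If you want to check the logic without running a Scalding job, the same flatMap can be tried on plain Scala collections. This is only a sketch: `Input` mirrors the case class from the answer, while `FlattenDemo` and `explode` are names made up for illustration and are not part of Scalding.

```scala
// Plain Scala collections stand in for TypedPipe here; no Scalding dependency.
case class Input(kind: String, map: Map[String, Int])

object FlattenDemo {
  // Hypothetical helper: produces one (kind, key, value) row per map entry.
  def explode(rows: Seq[Input]): Seq[(String, String, Int)] =
    rows.flatMap { rec =>
      rec.map.map { case (key, value) => (rec.kind, key, value) }
    }

  def main(args: Array[String]): Unit = {
    val rows = Seq(Input("Type A", Map("a1" -> 2, "a2" -> 2, "a3" -> 3)))
    // Prints one tuple per map entry of the single input row.
    explode(rows).foreach(println)
  }
}
```

Note that a row with an empty map produces no output rows at all, so if you need to keep such rows you would have to handle them separately.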