Mapping in Spark using Java


I have a file named myFile in the following format:

1,A,2,B
1,A,3,C
2,B,4,D

I want to map the second value of each line (index 1 after splitting on commas) to the line itself:

A -> 1,A,2,B
A -> 1,A,3,C
B -> 2,B,4,D

How can I achieve this using Spark Java?


Solution

  • This is how I achieved it, assuming myFile has already been loaded as a JavaRDD<String> with one element per line:

    // Key each line by its second comma-separated field (index 1); Tuple2 is scala.Tuple2.
    JavaPairRDD<String, String> pairs = myFile.mapToPair(s -> new Tuple2<>(s.split(",")[1], s));
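
The key-extraction step can be checked locally without a Spark cluster. Below is a minimal plain-Java sketch of the same logic. The class name `KeyBySecondField` and the methods `keyOf` and `pair` are hypothetical names introduced for illustration, and `Map.Entry` stands in for Spark's `Tuple2`. It also assumes you would rather fail with a clear message than with an `ArrayIndexOutOfBoundsException` when a line has fewer than two fields.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper class: mirrors the mapToPair lambda in plain Java.
class KeyBySecondField {

    // Returns the second comma-separated field (index 1) of a line.
    static String keyOf(String line) {
        String[] fields = line.split(",");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Expected at least two fields: " + line);
        }
        return fields[1];
    }

    // Pairs each line with its key; Map.Entry is a stand-in for Tuple2 here.
    static List<Map.Entry<String, String>> pair(List<String> lines) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(new SimpleEntry<>(keyOf(line), line));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Sample lines taken from the question.
        List<String> lines = Arrays.asList("1,A,2,B", "1,A,3,C", "2,B,4,D");
        for (Map.Entry<String, String> p : pair(lines)) {
            System.out.println(p.getKey() + " -> " + p.getValue());
        }
    }
}
```

If this helper were on the classpath of the Spark job, the lambda could presumably become `s -> new Tuple2<>(KeyBySecondField.keyOf(s), s)`, which keeps the malformed-line check in one place.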