We plan to move Apache Pig code to the new Spark platform.
Pig has a "Bag/Tuple/Field" concept and behaves similarly to a relational database. Pig provides support for CROSS/INNER/OUTER joins.
For CROSS JOIN, we can use alias = CROSS alias, alias [, alias …] [PARTITION BY partitioner] [PARALLEL n];
But now that we are moving to Spark, I can't find a counterpart for CROSS in the Spark API. Does anyone know of one?
It is oneRDD.cartesian(anotherRDD).
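To make the pairing concrete, here is a minimal sketch in Python. The sample data and the helper names local_cross and spark_cross are made up for illustration, and spark_cross assumes you already have a PySpark SparkContext to pass in as sc. The local function only models what CROSS produces; the Spark function makes the same cartesian call on RDDs.

```python
from itertools import product


def local_cross(left, right):
    """Pure-Python model of CROSS: pair every left element with every right element."""
    return list(product(left, right))


def spark_cross(sc, left, right):
    """Same pairing on Spark. `sc` is assumed to be an existing pyspark SparkContext."""
    left_rdd = sc.parallelize(left)
    right_rdd = sc.parallelize(right)
    # RDD.cartesian returns an RDD of (a, b) tuples, one per combination.
    return left_rdd.cartesian(right_rdd).collect()


# Hypothetical sample data, not from the original question.
users = ["alice", "bob"]
items = [1, 2, 3]

pairs = local_cross(users, items)
print(len(pairs))  # 2 * 3 = 6 combinations
```

With a live context, spark_cross(sc, users, items) should contain the same pairs, though the order of the collected results may differ, so compare them sorted or as sets. Keep in mind that the result has len(left) * len(right) elements, so a cartesian product of two large RDDs gets expensive quickly.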