scala, apache-spark, cartesian

Strange behavior of cartesian transformation in scala spark


I am using the cartesian transformation in Scala Spark, and I have a set of 6 coordinates as follows.

coord.collect
res114: Array[(Float, Float, Float)] = Array(
  (43.13,-67.331,-18.137),     
  (63.914,-67.078,-16.894), 
  (23.13,-60.341,-28.117),     
  (53.914,-67.028,-16.824), 
  (63.11,-69.311,-18.117),     
  (61.924,-67.068,-16.874)
)

coord.cartesian(coord) gives me the following output.

coord.cartesian(coord).collect
res118: Array[((Float, Float, Float), (Float, Float, Float))] = Array(
  ((43.13,-67.331,-18.137),(43.13,-67.331,-18.137)), 
  ((43.13,-67.331,-18.137),(63.914,-67.078,-16.894)), 
  ((43.13,-67.331,-18.137),(23.13,-60.341,-28.117)), 
  ((43.13,-67.331,-18.137),(53.914,-67.028,-16.824)), 
  ((43.13,-67.331,-18.137),(63.11,-69.311,-18.117)), 
  ((63.914,-67.078,-16.894),(43.13,-67.331,-18.137)), 
  ((63.914,-67.078,-16.894),(63.914,-67.078,-16.894)), 
  ((63.914,-67.078,-16.894),(23.13,-60.341,-28.117)),  ((...

Why is the 6th element not ((43.13,-67.331,-18.137),(61.924,-67.068,-16.874))?

Is some sort of shuffle going on that won't let me pick values in order?


Solution

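  • The order of the output is irrelevant, because Spark gives no ordering guarantee here. The pairs are produced partition by partition, so they need not follow the order in which the input was collected.

    As long as you get the correct set of 36 pairs in total (regardless of order), everything is fine.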
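
To illustrate why the sixth pair can differ from what you expect, here is a minimal plain-Scala sketch (no Spark involved). It is a simplified model, not Spark's actual code: it assumes the cartesian product is built by visiting every pair of partitions in turn, and it assumes a hypothetical layout with five coordinates in the first partition and one in the second. The names CartesianOrderSketch, partitions and partitionWiseCartesian are made up for this example.

```scala
object CartesianOrderSketch {
  type Coord = (Float, Float, Float)

  // The six coordinates from the question.
  val coords: Seq[Coord] = Seq(
    (43.13f, -67.331f, -18.137f),
    (63.914f, -67.078f, -16.894f),
    (23.13f, -60.341f, -28.117f),
    (53.914f, -67.028f, -16.824f),
    (63.11f, -69.311f, -18.117f),
    (61.924f, -67.068f, -16.874f)
  )

  // ASSUMPTION: a hypothetical partition layout (5 elements + 1 element).
  // The real layout depends on how the RDD was created.
  val partitions: Seq[Seq[Coord]] = Seq(coords.take(5), coords.drop(5))

  // Simplified model: for every pair of partitions, pair up their elements.
  def partitionWiseCartesian[A](parts: Seq[Seq[A]]): Seq[(A, A)] =
    for {
      left  <- parts
      right <- parts
      x     <- left
      y     <- right
    } yield (x, y)

  // Product in partition-pair order versus plain element order.
  val modelled: Seq[(Coord, Coord)] = partitionWiseCartesian(partitions)
  val naive: Seq[(Coord, Coord)] = for (x <- coords; y <- coords) yield (x, y)

  def main(args: Array[String]): Unit = {
    // Sixth pair: second coordinate paired with the first one.
    println(modelled(5))
    println(modelled == naive)             // false: the order differs
    println(modelled.toSet == naive.toSet) // true: same 36 pairs
    println(modelled.size)                 // 36
  }
}
```

Under these assumptions the sixth pair comes out as the second coordinate paired with the first, which matches the output in the question, while the full set of 36 pairs is unchanged. If a deterministic order is really needed in Spark, one option is to attach an index to each element with zipWithIndex before calling cartesian and then sort the result on the two indices with sortBy.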