scala apache-spark rdd bounding-box

How to find minimum and maximum of points in x and y coordinates


I have a collection of points with x and y coordinates in RDD[Double, Double] format. I want to find the minimum and maximum of both coordinates (the latitudes and longitudes) in this RDD. From those minimum and maximum values, my goal is to find the lower-left and upper-right corners of the bounding box of the whole space, as shown in this image: https://www.mathworks.com/matlabcentral/mlc-downloads/downloads/submissions/48509/versions/3/previews/COMP_GEOM_TLBX/html/Bounding_box_2D_01.png

This is what I started with, but the compiler reports "Too many arguments for method min()":

val minX = points.min(p => p(0))
val minY = points.min(p => p(1))
val maxX = points.max(p => p(0))
val maxY = points.max(p => p(1))

I am new to Scala, so please pardon me if this seems like a simple problem.


Solution

  • I assume that you meant RDD[(Double, Double)]. You could first convert each point into a zero-area bounding box, and then reduce all bounding boxes, combining all four values at once:

    val boundingBox = 
      points
      .map{ case (x, y) => (x, x, y, y) }
      .reduce { case ((xl, xr, xb, xt), (yl, yr, yb, yt)) =>
        (xl min yl, xr max yr, xb min yb, xt max yt)
      }
    

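    The l, r, b, t suffixes mean left, right, bottom, top. The x and y prefixes distinguish the two boxes being merged, not the coordinate axes. The result is a tuple (minX, maxX, minY, maxY), so the lower-left corner is (minX, minY) and the upper-right corner is (maxX, maxY).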

    Your code with min does not work because RDD.min takes no arguments in its first, non-implicit argument list (the only parameter is an implicit Ordering). There also seems to be no minBy / maxBy on RDD.
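
To make the corner extraction concrete, here is a minimal sketch of the same map-then-reduce idea. It is not the code from the answer: it assumes a plain Scala Seq as a stand-in for the RDD so that it runs without a Spark context, and the names BoundingBoxSketch, Point and corners, as well as the sample points, are made up for illustration. RDD.map and RDD.reduce are called the same way, and both versions fail on an empty collection.

```scala
// Plain Scala collections stand in for the RDD (an assumption, so that the
// sketch runs without Spark); the map/reduce calls have the same shape on an RDD.
object BoundingBoxSketch {
  type Point = (Double, Double)

  // Returns (lowerLeft, upperRight) for a non-empty collection of points.
  def corners(points: Seq[Point]): (Point, Point) = {
    val (minX, maxX, minY, maxY) =
      points
        .map { case (x, y) => (x, x, y, y) } // each point as a zero-area box
        .reduce { (a, b) =>
          // merge two boxes laid out as (left, right, bottom, top)
          (a._1 min b._1, a._2 max b._2, a._3 min b._3, a._4 max b._4)
        }
    ((minX, minY), (maxX, maxY))
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data, not taken from the question.
    val points = Seq((2.0, 5.0), (-1.0, 3.5), (4.0, -2.0))
    val (lowerLeft, upperRight) = corners(points)
    // lowerLeft is (-1.0, -2.0) and upperRight is (4.0, 5.0) for this sample
    println(s"lower left = $lowerLeft, upper right = $upperRight")

    // Per-axis alternative: project one coordinate, then call min with no
    // explicit arguments. On an RDD this would be points.map(_._1).min(),
    // at the cost of one pass over the data per value.
    val minX = points.map(_._1).min
    assert(minX == lowerLeft._1)
  }
}
```

The single reduce computes all four values in one pass, whereas the per-axis form is shorter to read but scans the data once for each of the four extremes.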