scala maven databricks azure-databricks geomesa

Installing GeoMesa on Databricks


I'm trying to install GeoMesa in Azure Databricks (Databricks version 6.6 / Scala 2.11) by following this tutorial.

I have installed GeoMesa in Databricks using the Maven coordinates org.locationtech.geomesa:geomesa-spark-jts_2.11:2.3.2, as described.

However, when I run import org.locationtech.geomesa.spark.GeoMesaSparkKryoRegistrator, I get an error saying the class is not found.

All the other imports in this tutorial work just fine:

import org.locationtech.jts.geom._
import org.locationtech.geomesa.spark.jts._

I looked at GeoMesa's GitHub repository, and that seems to be the correct package location.

I'm not super familiar with Java / Scala / Jars.

Not sure what other way I can approach this.

Thanks for help in advance!


Solution

  • Good question! It appears that there's a small error in this tutorial. The GeoMesaSparkKryoRegistrator is used for managing the serialization of SimpleFeatures in Spark.

    This tutorial does not seem to use SimpleFeatures (at least as of August 2020). As such, this import is likely unnecessary. You ought to be able to progress by skipping that import and the registration of the GeoMesaSparkKryoRegistrator.
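
    As a rough illustration of proceeding without the registrator, the notebook cell below is a minimal sketch that relies only on the geomesa-spark-jts library mentioned in the question. The sample row, the column names, and the `session` value are made up for illustration; in a Databricks notebook the existing `spark` session could be used instead.

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.locationtech.geomesa.spark.jts._

    // Reuse the session the notebook already provides, or create one.
    val session = SparkSession.builder().getOrCreate()

    // Register the JTS geometry types and st_* functions.
    // No Kryo registrator is needed for this step.
    session.withJTS

    import session.implicits._

    // Hypothetical sample data: a name plus longitude/latitude columns.
    val df = Seq(("sample", -77.03, 38.89)).toDF("name", "lon", "lat")

    // Build a Point geometry column from the two numeric columns.
    val points = df.withColumn("geom", st_makePoint($"lon", $"lat"))

    points.printSchema()
    ```

    If this cell runs, the geometry support from geomesa-spark-jts is working, and the missing GeoMesaSparkKryoRegistrator import is not blocking anything the tutorial needs.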

    The installed module, geomesa-spark-jts, provides just the spatial types and functions necessary for basic geometry support in Spark. To leverage GeoMesa's datastores in Spark, one would instead install a GeoMesa database-specific spark-runtime jar. Since those datastores use GeoTools SimpleFeatures, those jars include the GeoMesaSparkKryoRegistrator, and its use would be similar to what is in that notebook and in the documentation on geomesa.org.
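
    For that second case, a minimal sketch of the registration might look like the following. It assumes a datastore-specific spark-runtime jar is already on the classpath. The class is referenced by name as a string, so the snippet compiles without the jar, but the registrator will only load at runtime if the jar is present.

    ```scala
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    // Enable Kryo serialization and point Spark at GeoMesa's registrator.
    // Assumption: a GeoMesa spark-runtime jar providing this class is installed.
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator",
        "org.locationtech.geomesa.spark.GeoMesaSparkKryoRegistrator")

    // These settings only take effect for a newly created Spark context.
    val session = SparkSession.builder().config(conf).getOrCreate()
    ```

    On Databricks the Spark context already exists when a notebook starts, so these two properties would most likely need to go in the cluster's Spark configuration rather than in a notebook cell.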