scala, apache-spark, apache-spark-sql, apache-spark-1.6, spark-submit

Which jar has org.apache.spark.sql.types?


I am on Spark 1.x and attempting to read CSV files. When I need to specify some data types explicitly, as per the documentation, I have to import the types defined in the package org.apache.spark.sql.types.

import org.apache.spark.sql.types.{StructType, StructField, StringType}

This works fine when I use it interactively in spark-shell, but since I want to run it through spark-submit, I wrote some Scala code to do the same thing. However, when I attempt to compile that Scala code, I get an error saying it could not find org.apache.spark.sql.types. I looked at the contents of the spark-sql jar, but couldn't find these types defined there.
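
For reference, the kind of code I am trying to compile looks roughly like this (a minimal sketch assuming Spark 1.6 with an SQLContext and the external spark-csv data source; the file path and column names are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types.{StructType, StructField, StringType}

object ReadCsv {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ReadCsv"))
    val sqlContext = new SQLContext(sc)

    // Explicit schema built from the types in org.apache.spark.sql.types
    val schema = StructType(Seq(
      StructField("id", StringType, nullable = true),
      StructField("name", StringType, nullable = true)
    ))

    val df = sqlContext.read
      .format("com.databricks.spark.csv") // external spark-csv package, needed for CSV on Spark 1.x
      .schema(schema)
      .load("/path/to/file.csv")

    df.printSchema()
    sc.stop()
  }
}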

So, which jar has org.apache.spark.sql.types?


Solution

  • I looked at the source code for spark-sql on GitHub and found that these types are actually defined in the spark-catalyst jar. That didn't seem intuitive.

    Also, since StructType contains this import

    import org.json4s.JsonDSL._
    

    we end up with another dependent jar, json4s-core (see the build sketch below).
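
    To compile and package this for spark-submit without chasing individual jars, the usual approach is to declare spark-sql as a managed dependency and let the build tool resolve spark-catalyst (where org.apache.spark.sql.types lives) and json4s-core transitively. A minimal build.sbt sketch, assuming sbt and a Spark 1.6.x build for Scala 2.10; the version numbers are illustrative:

    scalaVersion := "2.10.6"

    libraryDependencies ++= Seq(
      // spark-sql pulls in spark-catalyst, and the Spark artifacts pull in json4s-core,
      // so both end up on the compile classpath transitively.
      // "provided" because spark-submit supplies Spark itself at runtime.
      "org.apache.spark" %% "spark-core" % "1.6.3" % "provided",
      "org.apache.spark" %% "spark-sql"  % "1.6.3" % "provided"
    )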