scala, apache-spark, apache-spark-sql, geojson

How to extract a GeoJSON schema with Spark


I have a GeoJSON file and I want to extract the corresponding schema (a StructType) with Spark. Any help would be appreciated.

I am using Spark 2.3.1.

GeoJSON:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "MultiLineString",
        "coordinates": [
          [
            [
              7.0847794888,
              50.7242091272
            ],
            [
              7.0859976701,
              50.7239505872
            ],
             ...
            [
              7.0946504307,
              50.722884129
            ]
          ]
        ]
      },
      "properties": {
        "strecke_id": 3,
        "auswertezeit": "2018-11-13T16:10:00",
        "geschwindigkeit": 26,
        "verkehrsstatus": "erhöhte Verkehrsbelastung"
      }
    },.....

Thank you for your help


Solution

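  • val data = spark.read.option("multiLine", true).json("hdfs://........./file.json")
    val schema = data.schema
    

    This gives you the schema as a StructType. The multiLine option matters here because the file is pretty-printed across several lines: by default Spark expects one JSON object per line, and on a file like this it would infer only a _corrupt_record column.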
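
    A slightly fuller sketch of the same idea, in case it helps: it reads the file, prints the inferred schema, round-trips the schema through its JSON form so it can be stored and reused without re-inferring, and flattens the features array into one row per feature. The object name, the local master setting and the /tmp/file.json path are assumptions made for illustration only; the column names (features, properties) are taken from the GeoJSON in the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{DataType, StructType}

object GeoJsonSchema {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("geojson-schema")
      .master("local[*]") // assumption: local run, drop this when submitting to a cluster
      .getOrCreate()

    // Hypothetical path, replace with the real location of the GeoJSON file
    val path = "/tmp/file.json"

    // multiLine lets Spark parse a JSON document that spans several lines
    val data = spark.read.option("multiLine", true).json(path)

    // The inferred schema as a StructType
    val schema: StructType = data.schema
    schema.printTreeString()

    // Serialize the schema to a JSON string and rebuild it, e.g. to store it for later
    val restored = DataType.fromJson(schema.json).asInstanceOf[StructType]

    // Reuse the stored schema so Spark does not have to infer it again
    val typed = spark.read.schema(restored).option("multiLine", true).json(path)

    // One row per feature, then expand the properties struct into columns
    val features = typed.select(explode(col("features")).as("feature"))
    features.select("feature.properties.*").show(truncate = false)

    spark.stop()
  }
}
```

    Passing an explicit schema on later reads avoids the extra pass over the data that inference needs, which can be worthwhile on large files.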