In my input JSON data, each record has a root element that wraps the data I actually want. I would like to remove the wrapper and keep only the target data in each record.
Input data:
{"rootElement": {"firstName": "John", "lastName": "Doe", "age": 11}}
{"rootElement": {"firstName": "Jane", "lastName": "Doe", "age": 33}}
{"rootElement": {"firstName": "Scott", "lastName": "Smith", "age": 22}}
Expected output:
{"firstName": "John", "lastName": "Doe", "age": 11}
{"firstName": "Jane", "lastName": "Doe", "age": 33}
{"firstName": "Scott", "lastName": "Smith", "age": 22}
I tried this so far:
sparkSession.read.json(inputFileLocation).toDF().map(func => func.getObject("rootElement"))
but it won't compile.
sparkSession.read.json already returns a DataFrame, so there is no need to call toDF(). The map call fails to compile because Row has no getObject method.
Try this:
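import sparkSession.implicits._ // enables the $"..." column syntax
val df = sparkSession.read.json("your path")
// "rootElement.*" expands the struct's fields into top-level columns
df.select($"rootElement.*").write.json("your output path")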
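For anyone who wants to try this without input files, here is a minimal self-contained sketch of the same idea. It is only an illustration: the object name UnwrapRootElement, the helper unwrap, and the local[*] session are my own assumptions, not part of the question, and it uses col(...) instead of $"..." so the helper does not depend on the implicits import.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical wrapper object, named here only for illustration.
object UnwrapRootElement {
  // Promote every field of the `rootElement` struct to a top-level column.
  def unwrap(df: DataFrame): DataFrame = df.select(col("rootElement.*"))

  def main(args: Array[String]): Unit = {
    // Local session for experimenting (assumption; use your own session in a real job).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("unwrap-root-element")
      .getOrCreate()
    import spark.implicits._ // needed for .toDS() on a Seq[String]

    // In-memory stand-in for the input file: one JSON object per line.
    val input = Seq(
      """{"rootElement": {"firstName": "John", "lastName": "Doe", "age": 11}}""",
      """{"rootElement": {"firstName": "Jane", "lastName": "Doe", "age": 33}}""",
      """{"rootElement": {"firstName": "Scott", "lastName": "Smith", "age": 22}}"""
    ).toDS()

    val flat = unwrap(spark.read.json(input))
    flat.printSchema()                      // top-level columns, no rootElement wrapper
    flat.toJSON.collect().foreach(println)  // one unwrapped JSON object per record

    spark.stop()
  }
}
```

Note that the column order in the output follows the schema Spark infers, so the fields may not appear in the same order as in the input.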