I have the following JSON:
[{"Name":"Tom","Age":"40","Account":"savings","address": {
"city": "New York",
"state": "NY"
}}]
I need to create a DataFrame in Spark from this JSON with the following columns:
Name Age Account city state
This is the code I am using:
from pyspark.sql.types import StructType, StructField, StringType

# Field names must match the JSON keys (Name, Age, Account), otherwise the columns come back null
schema2 = StructType([
    StructField("Name", StringType(), True),
    StructField("Age", StringType(), True),
    StructField("Account", StringType(), True),
    StructField("address", StructType([StructField('city', StringType(), True), StructField('state', StringType(), True)]), True),
])
path = 'dbfs:/FileStore/new.json'
df = spark.read.schema(schema2).option("multiLine", True).json(path)
And I am getting the structure below.
What schema change is needed to flatten the inner JSON into columns?
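The read schema has to mirror the nesting of the JSON, so the flattening is done after reading rather than in the schema. Select the nested fields with dot notation and alias them: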
from pyspark.sql.functions import col

# Pull the nested fields up to top-level columns
flattened_df = df.select(
    col("Name"),
    col("Age"),
    col("Account"),
    col("address.city").alias("city"),
    col("address.state").alias("state")
)