apache-spark, apache-spark-sql, parquet

How to have different schemas within parquet partitions


I have JSON files that I read into a DataFrame. The JSON has a struct field messages whose fields depend on name, as shown below.

Json1
{
   "ts":"2020-05-17T00:00:03Z",
   "name":"foo",
   "messages":[
      {
         "a":1810,
         "b":"hello",
         "c":390
      }
   ]
}

Json2
{
   "ts":"2020-05-17T00:00:03Z",
   "name":"bar",
   "messages":[
      {
         "b":"my",
         "d":"world"
      }
   ]
}

When I read the JSON files into a DataFrame, I get the following merged schema:

root
 |-- ts: string (nullable = true)
 |-- name: string (nullable = true)
 |-- messages: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- a: long (nullable = true)
 |    |    |-- b: string (nullable = true)
 |    |    |-- c: long (nullable = true)
 |    |    |-- d: string (nullable = true)

This is fine. Now, when I save to a Parquet file partitioned by name, how can I have different schemas in the foo and bar partitions?

path/name=foo
root
 |-- ts: string (nullable = true)
 |-- name: string (nullable = true)
 |-- messages: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- a: long (nullable = true)
 |    |    |-- b: string (nullable = true)
 |    |    |-- c: long (nullable = true)

path/name=bar
root
 |-- ts: string (nullable = true)
 |-- name: string (nullable = true)
 |-- messages: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- b: string (nullable = true)
 |    |    |-- d: string (nullable = true)

I am fine with getting a schema containing all the fields of foo and bar when I read from the root path. But when I read from path/name=foo, I expect just the foo schema.
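For reference, this is roughly how I read and write the data (PySpark, placeholder paths):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Spark infers one merged schema (a, b, c, d inside messages) across all the JSON files.
df = spark.read.json("path/to/json/input")

# Save as Parquet partitioned by name.
df.write.partitionBy("name").parquet("path")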


Solution

  1. Partitioning & Storing as a Parquet file:

    If you save in Parquet format, then when reading path/name=foo specify a schema containing only the required fields (a, b, c); Spark will then load only those fields.

    • If you don't specify a schema, all fields (a, b, c, d) will be included in the DataFrame, because every partition's Parquet files are written with the full DataFrame schema.

    EX:

    schema = ...  # StructType covering only the foo fields; see the sketch below
    spark.read.schema(schema).parquet("path/name=foo").printSchema()
    
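    A minimal sketch in PySpark, assuming an existing spark session and using the field types from the question (a, c as long; b as string). The name foo_schema is just illustrative; the partition column name lives in the directory path, not in the files, so it is left out of the schema:

    from pyspark.sql.types import (StructType, StructField, StringType,
                                   LongType, ArrayType)

    # Only the fields that belong to name=foo.
    foo_schema = StructType([
        StructField("ts", StringType(), True),
        StructField("messages", ArrayType(StructType([
            StructField("a", LongType(), True),
            StructField("b", StringType(), True),
            StructField("c", LongType(), True),
        ])), True),
    ])

    # Spark reads only the requested columns; d is never loaded.
    spark.read.schema(foo_schema).parquet("path/name=foo").printSchema()

    If you also need name as a column while reading a single partition directory, pass option("basePath", "path") so Spark recognizes name=foo as a partition.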

    2. Partitioning & Storing as a JSON/CSV file:

    Spark's JSON writer omits null fields, so the d column is not written into the path/name=foo files (and a, c are not written into name=bar). When we read only the name=foo directory, we don't get d included in the data.

    EX:

    spark.read.json("path/name=foo").printSchema()
    spark.read.csv("path/name=foo").printSchema()
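    A sketch of the JSON route, assuming df is the DataFrame read from the original JSON input and path_json is a placeholder output directory. Spark's JSON writer drops null fields by default, which is what keeps the per-partition schemas distinct:

    # Write partitioned JSON; null struct fields (d for foo, a and c for bar) are omitted.
    df.write.partitionBy("name").json("path_json")

    # Reading a single partition infers only the fields that were actually written.
    spark.read.json("path_json/name=foo").printSchema()   # ts, messages: a, b, c
    spark.read.json("path_json/name=bar").printSchema()   # ts, messages: b, d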