hadoop hive parquet

Is it possible to create an external Hive table on a Parquet file with a different schema?


I have my parquet files structured as follows:

+------+------------------+------------------+
| col1 |       col2       |        col3      |
+------+------------------+------------------+
|  v0  | { k1:v1, k2:v2 } | { k3:v3, k4:v4 } |
+------+------------------+------------------+

col2 and col3 are map columns. I want to create a Hive table on top of these files with the following schema:

+-------+-----+-----+-----+-----+
| col1  |  k1 |  k2 |  k3 |  k4 |
+-------+-----+-----+-----+-----+
|  v0   |  v1 |  v2 |  v3 |  v4 |
+-------+-----+-----+-----+-----+

Is it possible to create the above mapping? I'm familiar with a similar process for creating an external table on top of an HBase table.


Solution

  • You can do it with the following steps:

    1. Create a temporary (staging) table over the files as they are, keeping the map column types;

    2. Create a second table with the final structure that you need;

    3. Insert from the temporary table into the second table. In the INSERT ... SELECT you extract each value into its own column, using functions such as trim and split where the data requires them.
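
The three steps above can be sketched in HiveQL roughly as follows. This is only an illustration under assumptions: the table names (staging_maps, flat_table), the LOCATION path, and the STRING key and value types are hypothetical and need to be adjusted to the actual Parquet schema. Because col2 and col3 are already map columns, the keys can be read with the map[key] operator, so trim and split are only needed if the values require further cleanup.

```sql
-- Step 1: external staging table over the Parquet files as they are.
-- The path and the MAP<STRING, STRING> types are assumptions.
CREATE EXTERNAL TABLE staging_maps (
  col1 STRING,
  col2 MAP<STRING, STRING>,
  col3 MAP<STRING, STRING>
)
STORED AS PARQUET
LOCATION '/path/to/parquet/dir';

-- Step 2: table with the flattened target layout.
CREATE TABLE flat_table (
  col1 STRING,
  k1   STRING,
  k2   STRING,
  k3   STRING,
  k4   STRING
)
STORED AS PARQUET;

-- Step 3: copy the rows, pulling each key out of its map.
-- A key that is missing from a map yields NULL for that column.
INSERT INTO TABLE flat_table
SELECT
  col1,
  col2['k1'],
  col2['k2'],
  col3['k3'],
  col3['k4']
FROM staging_maps;
```

If copying the data is not desirable, the same SELECT could probably back a view (CREATE VIEW ... AS SELECT ...) over the staging table instead of a second physical table.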