I have a nested json data file. I need to write to Parquet with specific numeric data types. So need to control some fields being Integer, some being Long etc. I can cast the columns, but cannot write them back to the nested location. The data must be in the nested structure.
Here is what I tried:
CREATE TABLE cdg.`test2.parquet` AS SELECT CAST(t.l1.l2.id AS INTEGER) l1.l2.id FROM cdg.`data.json` t;
The error I get is Error: PARSE ERROR: Encountered "."
on the path after the closing bracket of the CAST statement.
AS INTEGER) l1.l2.id FROM c
^
Analysis #1: If I do not put in that nested field it writes the expression result out fine:
+---------+
| EXPR$0 |
+---------+
| 22222 |
| 22222 |
| 22222 |
| 22222 |
Any insights would be greatly appreciated.
This error is complaining that you cannot specify a field name l1.l2.id as alias to the cast expression. Alias should be a top level field.
Alias can be provided to a column with as or without using 'as' keyword (implicitly parser assumes as)
So for example if you want to name a the resultant expression of the cast expression aId then the following Sql statement can be used.
CREATE TABLE cdg.test2.parquet
AS SELECT CAST(t.l1.l2.id AS INTEGER) as Id FROM cdg.data.json
t;
Hope this helps.