Search code examples
apache-drill

How to cast and write back into a nested column


I have a nested json data file. I need to write to Parquet with specific numeric data types. So need to control some fields being Integer, some being Long etc. I can cast the columns, but cannot write them back to the nested location. The data must be in the nested structure.

Here is what I tried:

CREATE TABLE cdg.`test2.parquet` AS SELECT CAST(t.l1.l2.id AS INTEGER) l1.l2.id FROM cdg.`data.json` t;

The error I get is Error: PARSE ERROR: Encountered "." on the path after the closing bracket of the CAST statement. AS INTEGER) l1.l2.id FROM c ^

Analysis #1: If I do not put in that nested field it writes the expression result out fine:

+---------+ | EXPR$0 | +---------+ | 22222 | | 22222 | | 22222 | | 22222 |

Any insights would be greatly appreciated.


Solution

  • This error is complaining that you cannot specify a field name l1.l2.id as alias to the cast expression. Alias should be a top level field.

    Alias can be provided to a column with as or without using 'as' keyword (implicitly parser assumes as)

    So for example if you want to name a the resultant expression of the cast expression aId then the following Sql statement can be used.

    CREATE TABLE cdg.test2.parquet AS SELECT CAST(t.l1.l2.id AS INTEGER) as Id FROM cdg.data.json t;

    Hope this helps.