Search code examples
sqlarraysselectstructimpala

How to query to select a struct of array of struct column within a dataframe in Impala?


How would I query to select length(int) which is within the array 'details' which is within the 'packets' column? Hopefully, the image attached will explain better than me!

I've tried SELECT packets.details.length FROM test.ssh_data which doesn't work.

This gives me the following error:

illegal column/field reference 'packets.details.length' with intermediate collection 'details' of type 'ARRAY<STRUCT<datestamp:STRING,length:INT>>

Thank you in advance!

description of test.test_data


Solution

  • In the Impala nested types support, arrays and maps are treated as nested tables. You need to reference them in the FROM clause to unnest them. In that case you can add the array to the from clause, taking care to refer to it via sd, the alias of the table it's inside. E.g.

    SELECT d.length FROM test.ssh_data sd, sd.packets.details d