Search code examples
apache-drill

How to query on an array?


I have an object like this in drill:

{MyFruit: [{name:Mike, age:10},{name:Jacob,age:9},{name:William, age:6}]}

I can get "Mike" by doing:

Select MyFruit[0].name

Is there a way for me to get the list of every single "name"? I tried the following and it does not like it:

Select MyFruit[*].name

Solution

  • Given this fruits.json file:

    {"MyFruit": [{"name":"Mike", "age":10},{"name":"Jacob","age":9},{"name":"William", "age":6}]}
    

    The Drill statement is:

    select t.flatdata.name from (select flatten(MyFruit) as flatdata from dfs.`/Users/path/fruits.json`) t;
    +----------+
    |  EXPR$0  |
    +----------+
    | Mike     |
    | Jacob    |
    | William  |
    +----------+
    3 rows selected (0.14 seconds)
    

    You need to use a subquery that flattens the complex nesting and table aliases, t and flatdata, that resolve ambiguities.