Search code examples
apache-drill

How to flatten two json arrays using apache drill


{
    "id": "0001",
    "type": "donut",
    "name": "Cake",
    "ppu": 0.55,
    "batters":
        {
            "batter":
                [
                    { "id": "1001", "type": "Regular" },
                    { "id": "1002", "type": "Chocolate" },
                    { "id": "1003", "type": "Blueberry" },
                    { "id": "1004", "type": "Devil's Food" }
                ]
        },
    "topping":
        [
            { "id": "5001", "type": "None" },
            { "id": "5002", "type": "Glazed" },
            { "id": "5005", "type": "Sugar" },
            { "id": "5007", "type": "Powdered Sugar" },
            { "id": "5006", "type": "Chocolate with Sprinkles" },
            { "id": "5003", "type": "Chocolate" },
            { "id": "5004", "type": "Maple" }
        ]
}

I have the above json and i want to flatten both batter and topping arrays.

so i tried doing:

SELECT flatten(topping) as toping,flatten(batters.batter) as bat  FROM json.jsonfiles.`batter.json`;

which gives me

org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 43 to line 1, column 49: Table 'batters' not found SQL Query null [Error Id: 33cf80f2-f283-4401-90ce-c262474e0778 on acer:31010]

How can i solve this? Can we flatten two arrays in a single query?


Solution

  • You need to add table alias and refer it in columns. Below query works for me with sample data you have provided.

    SELECT flatten(a.topping) as toping,flatten(a.batters.batter) as bat  FROM  dfs.tmp.`batter.json` a;