How do I query HTML embedded inside a string inside a JSON file with Apache Drill?

I'm trying to use Apache Drill (for the first time) on a JSON file that looks like this:

{
    "Key1": {
      "htmltags": "<htmltag attr1='bravo' /><htmltag attr2='delta' /><htmltag attr3='charlie' />"
    },
    "Key2": {
      "htmltags": "<htmltag attr1='kilo' /><htmltag attr2='lima' /><htmltag attr3='mike' />"
    },
    "Key3": {
      "htmltags": "<htmltag attr1='november' /><htmltag attr2='foxtrot' /><htmltag attr3='sierra' />"
    }
}

My initial query was the hello world of drill: SELECT * FROM DataFile.json, and returned me the columns Key1, Key2, Key3. They only had one row, and it contained the entry: "<htmltag attr1='bravo' /><htmltag attr2='delta' /><htmltag attr3='charlie' />" [i.e., only the entry Key1.htmltags].

I have two questions:

Why was there only one row returned, when there were three differently valued entries for each key?
After using the KVGEN/FLATTEN functions to get at my strings inside "htmltags" above, is there a way to drill further into (analyse and extract data from) the HTML tags?

Solution

Unfortunately, it looks like Drill isn't the right tool (v1.1.0 as of this writing on Homebrew) for the job.

It looks like there is a bug with the system which is the reason why there is only one row despite multiple columns. I've filed a report: https://issues.apache.org/jira/browse/DRILL-4102
I've scoured the documentation once again, there are no tools to analyse HTML or XML natively. Depending on string manipulation for this is not a task I relish.

Hence, I'll go with an XML parser, DOM tree crawler or the like, and use a bash string function to extract the target tag strings awk/tee.