Search code examples
prestoapache-superset

Presto subqueries: Key not present in map


I have been banging my head a while to Superset -> Presto (PrestoSQL) -> Prometheus combination (as Superset does not yet support Prometheus) and got stymied with an issue when trying to extract columns from Presto's map type column containing Prometheus labels.

In order to get necessary labels mapped as columns from Superset's point of view, I create extra table (or I guess a view in this case) in Superset on top of existing table which had following SQL for creating the necessary columns:

SELECT labels['system_name'] AS "system",labels['instance'] AS "instance","timestamp" AS "timestamp","value" AS "value" FROM "up"

This table is then used as a data source in Superset's chart which treats it as a subquery. The resulting SQL query created by Superset and then sent to Presto looks e.g. like this:

SELECT "system" AS "system",
       "instance" AS "instance",
       "timestamp" AS "timestamp",
       "value" AS "value"
FROM
  (SELECT labels['system_name'] AS "system",
          labels['instance'] AS "instance",
          "timestamp" AS "timestamp",
          "value" AS "value"
   FROM "up") AS "expr_qry"
WHERE "timestamp" >= from_iso8601_timestamp('2020-10-19T12:00:00.000000')
  AND "timestamp" < from_iso8601_timestamp('2020-10-19T13:00:00.000000')
ORDER BY "timestamp" ASC
LIMIT 250;

However, what I get out from above is an error:

io.prestosql.spi.PrestoException: Key not present in map: system_name
    at io.prestosql.operator.scalar.MapSubscriptOperator$MissingKeyExceptionFactory.create(MapSubscriptOperator.java:173)
    at io.prestosql.operator.scalar.MapSubscriptOperator.subscript(MapSubscriptOperator.java:143)
    at io.prestosql.$gen.CursorProcessor_20201019_165636_32.filter(Unknown Source)

After reading a bit about queries from Presto's user guide, I tried a modified query from command line by using WITH:

WITH x AS (SELECT labels['system_name'] AS "system",labels['instance'] AS "instance","timestamp" AS "timestamp","value" AS "value" FROM "up")
SELECT system, timestamp, value FROM x
WHERE "timestamp" >= from_iso8601_timestamp('2020-10-19T12:00:00.000000')
  AND "timestamp" < from_iso8601_timestamp('2020-10-19T13:00:00.000000')
LIMIT 250;

And that went throught without any issues. But it seems that I have no way to define how Superset executes its queries, so I'm stuck with the first option. The question is, is there anything wrong with it which could be fixed?

I guess that one option (if everything else fails) would be defining extra tables in Presto side which would do the same trick for mapping the columns, thus hopefully avoiding above issue.


Solution

  • The map subscript operator in Presto requires that the key be present in the map. Otherwise, you get the failure you described.

    If some keys can be missing, you can use the element_at function instead, which will return a NULL result:

    Returns value for given key, or NULL if the key is not contained in the map.