Search code examples
sqlmongodbapache-supersetapache-drill

Apache superset with mongoDB(NO SQL database)


I am using MongoDB. My task is to build Dashboard charts for the data. So, I am using Apache superset. I connected MongoDB to apache drill as it wont connect directly with superset. Then connected apache drill to Apachesueperset. My collection is nested. How can I process this nested data to get use for dashboard charts.My data looks as below

{
  "_id": {
    "$oid": "6229d3cfdbfc81a8777e4821"
  },
  "jobs": [
    {
      "job_ID": {
        "$oid": "62289ded8079821eb24760e0"
      },
      "New": false,
      "Expired": false
    },
    {
      "job_ID": {
        "$oid": "6228a252fb4554dd5c48202a"
      },
      "New": true,
      "Expired": true
    },
    {
      "job_ID": {
        "$oid": "622af1c391b290d34701af9f"
      },
      "New": true,
      "Expired": false
    }
  ],
  "email": "[email protected]"
}

I am querying in apache drill as follows

SELECT flat.fill FROM (SELECT FLATTEN(t.jobs) AS fill FROM mongo.recruitingdb.flatten.`Vendorjobs` t) flat WHERE flat.fill.New  = flase;

And i am getting parsing error org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: Encountered "." at line 1, column 123.


Solution

  • Superset doesn't really handle nested data very well. Drill does however, so you'll have to craft queries to produce columns that can be visualized.
    Take a look here: https://drill.apache.org/docs/json-data-model/ and here: https://drill.apache.org/docs/querying-complex-data-introduction/.

    UPDATE:

    Try the query below. The FROM clause may not be exactly right, but you should get the idea from this.

    Note that you can access maps in Drill in two ways:

    • tablename.mapname.field OR
    • mapname['field']

    You can do this for any level of nesting.

    SELECT mongoTable.jobs.job_ID.`$oid` AS job_ID,
    mongoTable.jobs.`New` AS new,
    mongoTable.jobs.`Expired` AS expired
    FROM 
    (
       SELECT flatten(jobs) AS jobs
       FROM mongo.recruitingdb.flatten.`Vendorjobs` AS t1
       WHERE t1.jobs.New = false
    ) AS mongoTable