Search code examples
sqljsondruid

There's a count difference between Druid Native Query and Druid SQL when using query


I have a problem with Druid Query.

I wanted to get data count with hour granularity.

So, I used Druid SQL like this.

SELECT TIME_FLOOR(__time, 'PT1H') AS t, count(1) AS cnt FROM mydatasource GROUP BY 1

then I got response like this.

[
  {
    "t": "2022-08-31T09:00:00.000Z",
    "cnt": 12427
  },
  {
    "t": "2022-08-31T10:00:00.000Z",
    "cnt": 16693
  },
  {
    "t": "2022-08-31T11:00:00.000Z",
    "cnt": 16694
  },
  ...

But, When using native query like this,

{
  "queryType": "timeseries",
  "dataSource": "mydatasource",
  "intervals": "2022-08-31T07:01Z/2022-09-01T07:01Z",
  "granularity": {
    "type": "period",
    "period": "PT1H",
    "timeZone": "Etc/UTC"
  },
  "aggregations": [
    {
      "name": "count",
      "type": "longSum",
      "fieldName": "count"
    }
  ],
  "context": {
    "skipEmptyBuckets": "true"
  }
}

There's a difference result.

[
  {
    "timestamp": "2022-08-31T09:00:00.000Z",
    "result": {
        "count": 1288965
    }
  },
  {
    "timestamp": "2022-08-31T10:00:00.000Z",
    "result": {
        "count": 1431215
    }
  },
  {
    "timestamp": "2022-08-31T11:00:00.000Z",
    "result": {
        "count": 1545258
    }
  },
  ...

I want to use the result of Native Query.

What's the problem in my Druid SQL query??

How do I create a query to get native query results?


I found what's difference.

when using longSum type aggregation, I get result like native query.

So, I want to know how to query aggregate like below using sql.

"aggregations": [
  {
    "type": "longSum",
    "name": "count",
    "fieldName": "count"
  }
]

Solution

  • I found solution.

    Query like this.

    SELECT TIME_FLOOR(__time, 'PT1H') AS t, sum("count") AS cnt FROM mydatasource GROUP BY 1