Can't create Druid ingestion task through API

When I send a JSON ingestion specification to the Druid Overlord API, I get this response:

HTTP/1.1 400 Bad Request
Content-Type: application/json
Date: Wed, 25 Sep 2019 11:44:18 GMT
Server: Jetty(9.4.10.v20180503)
Transfer-Encoding: chunked

    {
      "error": "Instantiation of [simple type, class org.apache.druid.indexing.common.task.IndexTask] value failed: null"
    }

If I change the task type from index to index_parallel, I get this instead:

    {
      "error": "Instantiation of [simple type, class org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask] value failed: null"
    }

Submitting the same ingestion spec through Druid's web UI works fine.

Here is the ingestion spec that I use (slightly modified to hide sensitive data):

    {
      "type": "index_parallel",
      "dataSchema": {
        "dataSource": "daily_xport_test",
        "granularitySpec": {
          "type": "uniform",
          "segmentGranularity": "MONTH",
          "queryGranularity": "NONE",
          "rollup": false
        },
        "parser": {
          "type": "string",
          "parseSpec": {
            "format": "json",
            "timestampSpec": {
              "column": "dateday",
              "format": "auto"
            },
            "dimensionsSpec": {
              "dimensions": [
                {
                  "type": "string",
                  "name": "id",
                  "createBitmapIndex": true
                },
                {
                  "type": "long",
                  "name": "clicks_count_total"
                },
                {
                  "type": "long",
                  "name": "ctr"
                }
              ]
            }
          }
        }
      },
      "ioConfig": {
        "type": "index_parallel",
        "firehose": {
          "type": "static-google-blobstore",
          "blobs": [
            {
              "bucket": "data-test",
              "path": "/sample_data/daily_export_18092019/000000000000.json.gz"
            }
          ],
          "filter": "*.json.gz$"
        },
        "appendToExisting": false
      },
      "tuningConfig": {
        "type": "index_parallel",
        "maxNumSubTasks": 1,
        "maxRowsInMemory": 1000000,
        "pushTimeout": 0,
        "maxRetry": 3,
        "taskStatusCheckPeriodMs": 1000,
        "chatHandlerTimeout": "PT10S",
        "chatHandlerNumRetries": 5
      }
    }

Overlord API URI looks like this:


HTTPie command to send API request:

    http --print=Hhb POST http://host:8081/druid/indexer/v1/task < test_spec.json

I also get the same error when I submit the ingestion task using the DruidHook class in Airflow.


  • I found the solution. Apparently, the spec that the Druid UI generates uses a slightly different JSON layout than the one the API accepts. The top-level objects in the spec ("dataSchema", "ioConfig" and "tuningConfig") must be wrapped in a "spec" object, like this:

        {
            "type": "index_parallel",
            "spec": {
                "dataSchema": {
                    "dataSource": "daily_xport_test",
                    "granularitySpec": {
                        "type": "uniform",
                        "segmentGranularity": "MONTH",
                        "queryGranularity": "NONE",
                        "rollup": false
                    },
                    "parser": {
                        "type": "string",
                        "parseSpec": {
                            "format": "json",
                            "timestampSpec": {
                                "column": "dateday",
                                "format": "auto"
                            },
                            "dimensionsSpec": {
                                "dimensions": [{
                                        "type": "string",
                                        "name": "id",
                                        "createBitmapIndex": true
                                    }, {
                                        "type": "long",
                                        "name": "clicks_count_total"
                                    }, {
                                        "type": "long",
                                        "name": "ctr"
                                    }
                                ]
                            }
                        }
                    }
                },
                "ioConfig": {
                    "type": "index_parallel",
                    "firehose": {
                        "type": "static-google-blobstore",
                        "blobs": [{
                            "bucket": "data-test",
                            "path": "/sample_data/daily_export_18092019/000000000000.json.gz"
                        }],
                        "filter": "*.json.gz$"
                    },
                    "appendToExisting": false
                },
                "tuningConfig": {
                    "type": "index_parallel",
                    "maxNumSubTasks": 1,
                    "maxRowsInMemory": 1000000,
                    "pushTimeout": 0,
                    "maxRetry": 3,
                    "taskStatusCheckPeriodMs": 1000,
                    "chatHandlerTimeout": "PT10S",
                    "chatHandlerNumRetries": 5
                }
            }
        }
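
If you have several flat specs to convert, the wrapping can be scripted. The following is only a minimal Python sketch, assuming the flat layout shown in the question; `wrap_spec` and `SPEC_KEYS` are hypothetical names used for illustration and are not part of Druid or Airflow.

```python
import json

# Top-level sections that should be nested under "spec" (taken from the answer above).
SPEC_KEYS = ("dataSchema", "ioConfig", "tuningConfig")


def wrap_spec(flat):
    """Return a copy of `flat` with the spec sections moved under a "spec" key.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    if "spec" in flat:
        # Already in the wrapped layout, so return a shallow copy unchanged.
        return dict(flat)
    # Keep everything that is not a spec section (for example "type") at the top level.
    wrapped = {key: value for key, value in flat.items() if key not in SPEC_KEYS}
    # Nest the sections that are present, in a fixed order.
    wrapped["spec"] = {key: flat[key] for key in SPEC_KEYS if key in flat}
    return wrapped


if __name__ == "__main__":
    # Abbreviated stand-in for the flat spec from the question.
    flat = {
        "type": "index_parallel",
        "dataSchema": {"dataSource": "daily_xport_test"},
        "ioConfig": {"type": "index_parallel"},
        "tuningConfig": {"type": "index_parallel"},
    }
    print(json.dumps(wrap_spec(flat), indent=2))
```

The printed JSON can be saved to a file such as test_spec.json and posted with the same HTTPie command as in the question.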