Search code examples
elasticsearchelasticsearch-painless

ElasticSearch, simple two fields comparison with painless


I'm trying to run a query such as SELECT * FROM indexPeople WHERE info.Age > info.AgeExpectancy

Note the two fields are NOT nested, they are just json object

POST /indexPeople/_search
{
  "from" : 0,
  "size" : 200,
  "query" : {
    "bool" : {
      "filter" : [
        {
          "bool" : {
            "must" : [
              {
                "script" : {
                  "script" : {
                    "source" : "doc['info.Age'].value > doc['info.AgeExpectancy'].value",
                    "lang" : "painless"
                  },
                  "boost" : 1.0
                }
              }
            ],
            "adjust_pure_negative" : true,
            "boost" : 1.0
          }
        }
      ],
      "adjust_pure_negative" : true,
      "boost" : 1.0
    }
  },
  "_source" : {
    "includes" : [
      "info"
    ],
    "excludes" : [ ]
  }
}

However this query fails as

{
  "error" : {
    "root_cause" : [
      {
        "type" : "script_exception",
        "reason" : "runtime error",
        "script_stack" : [
          "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:121)",
          "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:115)",
          "doc['info.Age'].value > doc['info.AgeExpectancy'].value",
          "               ^---- HERE"
        ],
        "script" : "doc['info.Age'].value > doc['info.AgeExpectancy'].value",
        "lang" : "painless",
        "position" : {
          "offset" : 22,
          "start" : 0,
          "end" : 70
        }
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "indexPeople",
        "node" : "c_Dv3IrlQmyvIVpLoR9qVA",
        "reason" : {
          "type" : "script_exception",
          "reason" : "runtime error",
          "script_stack" : [
            "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:121)",
            "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:115)",
            "doc['info.Age'].value > doc['info.AgeExpectancy'].value",
            "               ^---- HERE"
          ],
          "script" : "doc['info.Age'].value > doc['info.AgeExpectancy'].value",
          "lang" : "painless",
          "position" : {
            "offset" : 22,
            "start" : 0,
            "end" : 70
          },
          "caused_by" : {
            "type" : "illegal_state_exception",
            "reason" : "A document doesn't have a value for a field! Use doc[<field>].size()==0 to check if a document is missing a field!"
          }
        }
      }
    ]
  },
  "status" : 400
}

Is there a way to achieve this? What is the best way to debug it? I wanted to print the objects or look at the logs (which aren't there), but I couldn't find a way to do neither.

The mapping is:

{
  "mappings": {
    "_doc": {
      "properties": {
        "info": {
          "properties": {
            "Age": {
              "type": "long"
            },
            "AgeExpectancy": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}

Solution

  • perhaps you already solved the issue. The reason why the query failed is clear:

    "caused_by" : {
                "type" : "illegal_state_exception",
                "reason" : "A document doesn't have a value for a field! Use doc[<field>].size()==0 to check if a document is missing a field!"
    }
    

    Basically there is one or more document that do not have one of the queried fields. So you can achieve the result you need by using an if to check if the fields do indeed exists. If they do not exist, you can simply return false as follows:

    {
      "script": """
        if (doc['info.Age'].size() > 0 && doc['info.AgeExpectancy'].size() > 0) {
          return doc['info.Age'].value > doc['info.AgeExpectancy'].value
        }
        return false;
    }
    """
    

    I tested it with an Elasticsearch 7.10.2 and it works.

    What is the best way to debug it

    That is a though question, perhaps someone has a better answer for it. I try to list some options. Obviously, debugging requires to read carefully the error messages.

    PAINLESS LAB If you have a pretty recent version of Kibana, you can try to use the painless lab to simulate your documents and get the errors quicker and in a more focused environment.

    KIBANA Scripted Field You can try to create a bolean scripted field in the index pattern named condition. Before clicking create remember to click "preview result": enter image description here

    MINIMAL EXAMPLE Create a minimal example to reduce the complexity. For this answer I used a sample index with four documents with all possible cases.

    1. No info: { "message": "ok"}
    2. Info.Age but not AgeExpectancy: {"message":"ok","info":{"Age":14}}
    3. Info.AgeExpectancy but not Age: {"message":"ok","info":{"AgeExpectancy":12}}
    4. Info.Age and AgeExpectancy: {"message":"ok","info":{"Age":14, "AgeExpectancy": 12}}