
What exactly does Painless do under the hood of the array contains method?


Suppose I have a document mapping with a field like the one below:

{
    "template": {
        "mappings":{
            "template":{
                "properties": {
                    "sth": {
                        "type": "long"
                    }
                }
            }
        }
    }
}

The field sth holds an array of long values.

I want to check whether the field sth contains a given value, so I wrote a Painless script like doc['sth'].values.contains(1)

It failed. After reading this article I understood why: I must pass a Long to the contains method. So I changed my Painless script to doc['sth'].values.contains(1L)

It works, but some further experiments confused me even more.

The scripts

doc['sth'].values[0] == 1 and doc['sth'].values[0] == 1L

both work. OK, I read the Painless documentation and understand that the integer type is promoted to the long type.

But doc['sth'].values[0] == Integer.valueOf(156) also works. However, the documentation lists this as an error case:

If a comparison is made between a primitive type value and a reference type value.

Shouldn't it raise an error? Or does automatic unboxing/boxing happen somewhere?

Moreover, I wrote the scripts doc['sth'].values[0] instanceof Long and doc['sth'].values[0] instanceof long; both work and return true.

Does a Painless array store the primitive type or the boxed reference type?

Finally, to the question in the title: what exactly does Painless do under the hood of the array contains method?

This is just my unfounded guess:

Does Array::contains have a signature like contains(Object o), and does it use == to compare the parameter with its stored elements?

But if that is true, why does doc['sth'].values[0] == 1 succeed while doc['sth'].values.contains(1) fails?


Solution

  • values is actually of type ScriptDocValues (an abstract subclass of AbstractList). For a field of type long, the concrete implementation is ScriptDocValues.Longs.

    So to answer your question: the .contains() method simply delegates to AbstractCollection.contains(), which compares elements with .equals(). The int literal 1 is boxed to an Integer, and an Integer is never equal to a Long under .equals(). This is why doc['sth'].values.contains(1) fails while doc['sth'].values.contains(1L) succeeds.

    Internally, the values are stored in a long array (primitive type). However, the get() method returns a Long (boxed type) in order to satisfy the AbstractList contract, and getValue() returns a primitive long unboxed from the call to get(). This is why both doc['sth'].values[0] == 1 and doc['sth'].values[0] == 1L succeed.

    If you call get() and there's no doc value, you'll get an IllegalStateException stating that you should...

    use doc[<field>].size()==0 to check if a document is missing a field!
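
The behaviour described above can be reproduced in plain Java, outside Elasticsearch. The following is only a minimal sketch: LongValuesSketch is a hypothetical stand-in for ScriptDocValues.Longs, not the real class, and the exception message is illustrative. It assumes nothing more than a primitive long array wrapped in an AbstractList whose get() returns a boxed Long.

```java
import java.util.AbstractList;

// Hypothetical stand-in for ScriptDocValues.Longs: primitive storage, boxed get().
class LongValuesSketch extends AbstractList<Long> {
    private final long[] values; // primitive backing array

    LongValuesSketch(long... values) {
        this.values = values;
    }

    @Override
    public Long get(int index) {
        if (values.length == 0) {
            // Imitates the idea of the error described above; the message is illustrative.
            throw new IllegalStateException("no value; check size() == 0 first");
        }
        return values[index]; // autoboxed from long to Long
    }

    @Override
    public int size() {
        return values.length;
    }

    public static void main(String[] args) {
        LongValuesSketch sth = new LongValuesSketch(1L, 156L);

        // contains(Object) is inherited from AbstractCollection and compares with equals().
        System.out.println(sth.contains(1));   // argument is boxed to Integer
        System.out.println(sth.contains(1L));  // argument is boxed to Long

        // == between a Long and a numeric literal unboxes and widens, so it compares by value.
        System.out.println(sth.get(0) == 1);
        System.out.println(sth.get(0) == 1L);
    }
}
```

In this sketch the first contains call reports false and the second true, while both == comparisons are true, which matches the difference observed between contains(1) and values[0] == 1 in the question. Painless applies its own promotion rules on top of this, so treat the sketch as an analogy rather than a description of the Painless compiler.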