Tags: elasticsearch, vector, elasticsearch-painless

Elasticsearch Painless: bug when using vector functions in for loops


I ran into what seems to be a bug in Painless: when a vector function such as l2norm() is called inside a for loop, every iteration returns the result of the first iteration. I'm using the Painless script in a function_score query; I hope the query below sheds some light. I throw an exception to inspect the value in each iteration, and it is always the score for the first vector. I know this because I reordered the parameters a couple of times, and the score is always "stuck" on whichever vector comes first. So I suspect that l2norm() (and perhaps all vector functions?) is backed by an object instance that is only instantiated once. If that is the case, what would be a workaround?
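To make the hypothesis above concrete, here is a toy Python model of a function whose first argument is captured the first time it is called. This is only an illustration of the suspected mechanism, not Elasticsearch's actual implementation; the names `L2NormBinding`, `Script`, and `run_loop`, and the document vector used in the usage note, are all hypothetical.

```python
import math


class L2NormBinding:
    """Toy stand-in for a vector function whose query vector is fixed at construction."""

    def __init__(self, query_vector):
        self.query_vector = list(query_vector)

    def distance(self, doc_vector):
        # Euclidean (L2) distance between the captured query vector and the document vector.
        return math.sqrt(
            sum((q - d) ** 2 for q, d in zip(self.query_vector, doc_vector))
        )


class Script:
    """Hypothetical script runtime that creates the binding once and then reuses it."""

    def __init__(self):
        self._binding = None

    def l2norm(self, query_vector, doc_vector):
        if self._binding is None:
            # Only the query vector from the first call is ever captured.
            self._binding = L2NormBinding(query_vector)
        # Later calls silently ignore their own query_vector argument.
        return self._binding.distance(doc_vector)


def run_loop(filter_vectors, doc_vector):
    """Mimic the for loop in the script: one l2norm call per filter vector."""
    script = Script()
    return [script.l2norm(v, doc_vector) for v in filter_vectors]
```

With this model, `run_loop([[1.0, 2.0, 3.0], [0.1, 0.4, 0.5]], [1.0, 1.0, 1.0])` returns the same distance twice, which matches the "stuck on the first vector" behaviour described above.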

Link to the ES discussion: https://discuss.elastic.co/t/painless-bug-using-for-loops-and-vector-functions/267263

    {
        "query": {
                "nested": {
                        "path": "media",
                        "query": {
                                "function_score": {
                                        "boost_mode": "replace",
                                        "query": {
                                                "bool": {                                                       
                                                        "filter": [{
                                                                "exists": {
                                                                        "field": "media.full_body_dense_vector"
                                                                }
                                                        }]
                                                }
                                        },
                                        "functions": [{
                                                "script_score": {
                                                        "script": {
                                                                "source": "if (params.filterVectors.size() > 0 && params.filterCutOffScore >= 0) {\n  for (int i = 0; i < params.filterVectors.size(); i++) {\n    def c = params.filterVectors[i];\n    double euDistance = l2norm(c, doc['media.full_body_dense_vector']);\n    if (i == 1) { throw new Exception(euDistance + ''); }\n  }\n}\nreturn 1.0f;",
                                                                "params": {
                                                                        "filterVectors": [
                                                                                [1.0, 2.0, 3.0], [0.1, 0.4, 0.5]
                                                                        ],
                                                                        "filterCutOffScore": 1.04
                                                                },
                                                                "lang": "painless"
                                                        }
                                                }
                                        }]
                                }
                        }
                }
        },
        "size": 500,
        "from": 0,
        "track_scores": true
    }
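For reference, l2norm() computes the Euclidean distance between the query vector and the stored vector, so the two filterVectors in the query should normally produce two different values. A minimal Python sketch of the expected per-iteration distances follows; the stored vector `doc_vector` is a made-up example, since the actual indexed vectors are not shown in the question.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Taken from params.filterVectors in the query above.
filter_vectors = [[1.0, 2.0, 3.0], [0.1, 0.4, 0.5]]

# Hypothetical stored value of media.full_body_dense_vector (an assumption).
doc_vector = [1.0, 1.0, 1.0]

# One distance per loop iteration; a correct loop yields distinct values here.
distances = [l2_distance(v, doc_vector) for v in filter_vectors]

for i, d in enumerate(distances):
    print(f"iteration {i}: {d:.4f}")
```

If the exception thrown at i == 1 reports the first of these values rather than the second, the loop is reusing the first vector.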

Solution

  • First off, thanks to Joe for confirming that I wasn't imagining things. Second, the Elasticsearch team has triaged the issue and confirmed that it is indeed a bug. The answer to this post is therefore a link to the GitHub issue, so that people can track which Elasticsearch version patches this behaviour.