Tags: elasticsearch, elastic-stack, elasticsearch-dsl, elasticsearch-painless

Execute Cosine Similarity inside script_score function if field is present in the document


I am trying to use cosine similarity in a script_score function. The query fails when the dense vector field I want to measure similarity against is missing from a document.

I spent a lot of time searching for a way to check whether the field is present in a document, but without success.

I tried:

  1. Checking with doc['field_name'] == null

  2. Checking with doc['field_name'].size() == 0

  3. Checking with doc['field_name'].value == null

The query I am using is:

POST /sidx-4111c0fc-a8ba-523c-9851-34a2b803643b/_search/
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": {
            "multi_match": {
              "query": "nri customer bank loan",
              "fields": [],
              "fuzziness": "AUTO"
            }
          }
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "double score =0; score = doc['dense_vector_field'] == null ?0: cosineSimilarity(params.qv, 'dense_vector_field'); if(score>=0.8 && score<=1.0){return 10000+score;} else if(score>=0.60 && score<0.80){return score+1000;} else{return score+100}",
              "params": {
                "qv": [1,1,0,1]
              }
            }
          }
        }
      ],
      "boost_mode": "sum"
    }
  }
}

I am getting the following error:

  "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "No field found for [dense_vector_field] in mapping with types []"
          }
        }

Solution

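  • The error comes from the doc['dense_vector_field'] access itself: when the field is not in the mapping, the lookup throws before any == null or size() comparison can run. You can call doc.containsKey('dense_vector_field') first; it returns a boolean and does not throw.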

    On a related note, why are you accessing params.queryVector when the only key in your params is qv?
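
As a rough sketch of how the guard could be wired into the query from the question: the script below checks doc.containsKey(...) first, then checks doc[...].size() to skip documents that have no value for the field, and only then calls cosineSimilarity. It assumes an Elasticsearch version that accepts the field name as a string in cosineSimilarity (as the question's own script does), and it keeps the question's index name, parameters, and score tiers unchanged. Documents without the vector fall through with score = 0 and land in the lowest tier, which is an assumption about the intended behavior.

```json
POST /sidx-4111c0fc-a8ba-523c-9851-34a2b803643b/_search
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": {
            "multi_match": {
              "query": "nri customer bank loan",
              "fields": [],
              "fuzziness": "AUTO"
            }
          }
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "double score = 0; if (doc.containsKey('dense_vector_field') && doc['dense_vector_field'].size() != 0) { score = cosineSimilarity(params.qv, 'dense_vector_field'); } if (score >= 0.8) { return 10000 + score; } else if (score >= 0.6) { return 1000 + score; } else { return 100 + score; }",
              "params": {
                "qv": [1, 1, 0, 1]
              }
            }
          }
        }
      ],
      "boost_mode": "sum"
    }
  }
}
```

The && short-circuits, so doc['dense_vector_field'] is never evaluated when containsKey returns false. The + 100 in the last branch also keeps the result non-negative when cosine similarity is below zero, which matters because script scores must not be negative.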