ElasticSearch text array length


I use Elasticsearch 8.2 and have a problem with a text array field. When the array contains the same string more than once (for example ["a","a"]), doc['my_array'].length returns 1.

Steps to reproduce:

I create two documents:

PUT my-index-000001/_doc/1
{
  "my_field": 10,
  "my_array": ["a","b"]
}

PUT my-index-000001/_doc/2
{
  "my_field": 10,
  "my_array": ["a","a"]
}

Then I update the mapping of the my_array field:

PUT my-index-000001/_mapping
{
  "properties": {
    "my_array": { 
      "type":     "text",
      "fielddata": true
    }
  }
}

Then I run this query, which returns the wrong array length:

GET my-index-000001/_search
{
  "script_fields": {
    "my_array_length": {
      "script": { 
        "source": "doc['my_array'].length", 
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}

Response:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "my_array_length" : [
            2
          ]
        }
      },
      {
        "_index" : "my-index-000001",
        "_id" : "2",
        "_score" : 1.0,
        "fields" : {
          "my_array_length" : [
            1      
          ]
        }
      }
    ]
  }
}

The document with _id 2 returns 1.

Can someone tell me why, and how I can return the correct array length?


Solution

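  • Trick: Use params['_source'].my_array.length instead of doc['my_array'].length. doc['my_array'] reads the field's fielddata, which holds only the distinct terms of each document, so ["a","a"] counts as one value. _source keeps the original array, duplicates included.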

    GET array_size/_search
    {
      "script_fields": {
        "number_of_arrays": {
          "script": {
            "source": "params['_source'].my_array.length"
          }
        }
      }
    }
    

    Note: It is recommended to use an ingest pipeline and calculate the array length at index time. For a similar discussion, see: https://discuss.elastic.co/t/how-to-get-array-size-for-each-document/226599
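
    A minimal sketch of the ingest pipeline approach. The pipeline name (my_array_length_pipeline) and the target field (my_array_length) are hypothetical and do not come from the original question. The script processor stores the element count before the document is indexed, so duplicate values should still be counted:

    # Assumption: pipeline name and my_array_length field are illustrative only
    PUT _ingest/pipeline/my_array_length_pipeline
    {
      "description": "Store the number of elements of my_array at index time",
      "processors": [
        {
          "script": {
            "lang": "painless",
            "source": "ctx.my_array_length = ctx.my_array instanceof List ? ctx.my_array.size() : (ctx.my_array == null ? 0 : 1)"
          }
        }
      ]
    }

    # Index the document through the pipeline
    PUT my-index-000001/_doc/2?pipeline=my_array_length_pipeline
    {
      "my_field": 10,
      "my_array": ["a","a"]
    }

    The stored my_array_length field can then be returned, filtered, or sorted on like any other numeric field, with no script reading _source at query time. Documents indexed before the pipeline existed would need to be reindexed through it.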