
More_like_this query with a filter


I have 1702 documents indexed in Elasticsearch. Each document has a Category field and a SequentialId field.

I first fetched the documents with category 1.1 whose SequentialId is between 1 and 850, using the query below.

    POST /testucb/docs/_search
    {
      "size": 1702,
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "Category": "1.1"
              }
            }
          ],
          "filter": [
            {
              "range": {
                "SequentialId": {
                  "gte": 1,
                  "lte": 850
                }
              }
            }
          ]
        }
      }
    }

The query above returned 834 documents matching category 1.1. (I have a program that parses the 834 _ids out of the resulting JSON.) My goal now is to pass these 834 _ids to a more_like_this query as a training set and run it against the remaining documents, which are my test set (SequentialId 851 to 1702).

I tried the more_like_this query below, with a filter inside it.

    POST /testucb/docs/_search
    {
      "size": 1702,
      "fields": [
        "SequentialId",
        "Category",
        "PRIMARY_CONTENT_EN"
      ],
      "query": {
        "more_like_this": {
          "fields": [
            "PRIMARY_CONTENT_EN"
          ],
          "like": [
            <-----------834 _ids goes here ---->
          ],
          "filter": [
            {
              "range": {
                "SequentialId": {
                  "gte": 851,
                  "lte": 1702
                }
              }
            }
          ],
          "min_term_freq": 1,
          "min_doc_freq": 1,
          "max_query_terms": 15,
          "min_word_len": 3,
          "stop_words": [],
          "boost": 2,
          "include": false
        }
      }
    }

I get a query parsing exception saying that MLT does not support filter. I am not sure how to restrict the query to the remaining documents (SequentialId 851 to 1702), which are my test set.

I hope it is clear what I am trying to accomplish. Can you please help me with this? I am new to Elasticsearch.


Solution

  • more_like_this has no filter parameter of its own. To run it against a filtered set of documents, wrap it in a bool query and put the range in the bool query's filter clause (Elasticsearch 2.0 and later):

    POST /testucb/docs/_search
    {
      "size": 1702,
      "fields": [
        "SequentialId",
        "Category",
        "PRIMARY_CONTENT_EN"
      ],
      "query": {
        "bool": {
          "must": [
            {
              "more_like_this": {
                "fields": [
                  "PRIMARY_CONTENT_EN"
                ],
                "like": [
                  <-----------834 _ids goes here ---->
                ],
                "min_term_freq": 1,
                "min_doc_freq": 1,
                "max_query_terms": 15,
                "min_word_len": 3,
                "stop_words": [],
                "boost": 2,
                "include": false
              }
            }
          ],
          "filter": {
            "range": {
              "SequentialId": {
                "gte": 851,
                "lte": 1702
              }
            }
          }
        }
      }
    }
    

    If you use a version of Elasticsearch older than 2.0, use the filtered query instead: it takes the more_like_this query under "query" and the range under "filter".
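
Since the 834 _ids have to be pasted into "like" somehow, here is a rough Python sketch of building both request bodies programmatically. It only produces JSON and does not talk to a cluster. The function names (like_items, mlt_clause, bool_body, filtered_body) and the sample _ids are made up for illustration. I am assuming each "like" entry is a document reference with _index, _type and _id, which is the form the 2.x more_like_this query accepts. On 1.x releases the parameter names inside more_like_this may differ, so check the docs for your version before reusing mlt_clause there.

```python
import json


def like_items(ids, index="testucb", doc_type="docs"):
    """Turn plain _id values into document references for the "like" array."""
    return [{"_index": index, "_type": doc_type, "_id": str(i)} for i in ids]


def mlt_clause(ids):
    """The more_like_this clause from the answer, with "like" filled in."""
    return {
        "more_like_this": {
            "fields": ["PRIMARY_CONTENT_EN"],
            "like": like_items(ids),
            "min_term_freq": 1,
            "min_doc_freq": 1,
            "max_query_terms": 15,
            "min_word_len": 3,
            "stop_words": [],
            "boost": 2,
            "include": False,
        }
    }


def range_filter(low=851, high=1702):
    """Range restriction on SequentialId (the test set in the question)."""
    return {"range": {"SequentialId": {"gte": low, "lte": high}}}


def bool_body(ids, low=851, high=1702):
    """Elasticsearch 2.0+: more_like_this under must, range under filter."""
    return {
        "size": 1702,
        "fields": ["SequentialId", "Category", "PRIMARY_CONTENT_EN"],
        "query": {
            "bool": {
                "must": [mlt_clause(ids)],
                "filter": range_filter(low, high),
            }
        },
    }


def filtered_body(query_clause, filter_clause):
    """Pre-2.0: wrap any query and any filter in a filtered query."""
    return {
        "size": 1702,
        "query": {
            "filtered": {
                "query": query_clause,
                "filter": filter_clause,
            }
        },
    }


if __name__ == "__main__":
    # Placeholder _ids; in practice this is the list parsed from the first search.
    training_ids = ["id-1", "id-2"]
    print(json.dumps(bool_body(training_ids), indent=2))
```

The printed JSON can be sent as the body of POST /testucb/docs/_search. Keeping the range in the filter clause means it only narrows the candidate set and does not affect the similarity score.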