python-3.6 elasticsearch-7

Elasticsearch query is not returning the expected result


I have a JSON structure like the one below:

{"DocumentName":"es","DocumentId":"2","Content": [{"PageNo":1,"Text": "The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."},{"PageNo":2,"Text": "The query string is processed using the same analyzer that was applied to the field during indexing."}]}

I need stemmed, analyzed results for the Content.Text field. For that, I defined a mapping when creating the index, as shown below:

curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d"{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_stemmer"]
                }
            },
            "filter": {
                "my_stemmer": {
                    "type": "stemmer",
                    "name": "english"
                }
            }
        }
    }
}, {
    "mappings": {
        "properties": {
            "DocumentName": {
                "type": "text"
            },
            "DocumentId": {
                "type": "keyword"
            },
            "Content": {
                "properties": {
                    "PageNo": {
                        "type": "integer"
                    },
                    "Text": "_all": {
                        "type": "text",
                        "analyzer": "my_analyzer",
                        "search_analyzer": "my_analyzer"
                    }
                }
            }
        }
    }
}
}"

I checked the analyzer I created:

curl -X GET "localhost:9200/myindex/_analyze?pretty" -H "Content-Type: application/json" -d"{\"analyzer\":\"my_analyzer\",\"text\":\"indexing\"}"

and it gave the result:

{
  "tokens" : [
    {
      "token" : "index",
      "start_offset" : 0,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 0
    }
  ]
}
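
The expectation behind this check is that a match query succeeds whenever the analyzed query token equals a token produced from the field at index time. A toy sketch of that idea in plain Python follows. The names `toy_analyze` and `toy_match` are hypothetical, and the suffix stripping is a deliberately naive stand-in, not the stemmer Elasticsearch actually uses:

```python
import re


def toy_analyze(text):
    """Lowercase, split on non-alphanumerics, strip a trailing 'ing'.

    A naive stand-in for a tokenizer + lowercase + stemmer chain,
    for illustration only.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t[:-3] if len(t) > 5 and t.endswith("ing") else t for t in tokens]


def toy_match(query, field_text):
    """True if any analyzed query token also occurs in the analyzed field."""
    indexed = set(toy_analyze(field_text))
    return any(tok in indexed for tok in toy_analyze(query))


page_2 = ("The query string is processed using the same analyzer "
          "that was applied to the field during indexing.")

print(toy_analyze("indexing"))     # ['index']
print(toy_match("index", page_2))  # True
```

Under this model, the query "index" should match any page whose text contains "indexing", provided the same analysis is applied on both sides.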

But after uploading the JSON document into the index, searching for "index" returns 0 results:

import requests
from elasticsearch import Elasticsearch

res = requests.get('http://localhost:9200')
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})

Any help would be much appreciated. Thank you in advance.


Solution

  • Ignore my comment. The stemmer is working. Try the following:

    Mapping:

    curl -X DELETE "localhost:9200/myindex"
    
    curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d'
    { 
        "settings":{ 
           "analysis":{ 
              "analyzer":{ 
                 "english_exact":{ 
                    "tokenizer":"standard",
                    "filter":[ 
                       "lowercase"
                    ]
                 }
              }
           }
        },
        "mappings":{ 
           "properties":{ 
              "DocumentName":{ 
                 "type":"text"
              },
              "DocumentId":{ 
                 "type":"keyword"
              },
              "Content":{ 
                 "properties":{ 
                    "PageNo":{ 
                       "type":"integer"
                    },
                    "Text":{ 
                       "type":"text",
                       "analyzer":"english",
                       "fields":{ 
                          "exact":{ 
                             "type":"text",
                             "analyzer":"english_exact"
                          }
                       }
                    }
                 }
              }
           }
        }
     }'
    

    Data:

    curl -XPOST "localhost:9200/myindex/_doc/1" -H "Content-Type: application/json" -d'
    { 
       "DocumentName":"es",
       "DocumentId":"2",
       "Content":[ 
          { 
             "PageNo":1,
             "Text":"The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."
          },
          { 
             "PageNo":2,
             "Text":"The query string is processed using the same analyzer that was applied to the field during indexing."
          }
       ]
    }'
    

    Query:

    curl -XGET 'localhost:9200/myindex/_search?pretty' -H "Content-Type: application/json"  -d '
    { 
       "query":{ 
          "simple_query_string":{ 
             "fields":[ 
                "Content.Text"
             ],
             "query":"index"
          }
       }
    }'
    

    Exactly one document is returned, as expected. I also tested the following stems, and they all worked correctly with the proposed mapping: apply (applied), texts (text), use (using).

    Python example:

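    import requests
    from elasticsearch import Elasticsearch
    
    # Optional sanity check that the node is reachable before querying.
    ping = requests.get('http://localhost:9200')
    ping.raise_for_status()
    
    es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
    
    # The english analyzer stems the query "index" and the indexed "indexing" alike.
    res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})
    
    print(res)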
    

    Tested on Elasticsearch 7.4.
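
For anyone who prefers to stay in Python for the whole flow, here is a minimal sketch of the same index definition as a single dict, plus a helper for reading the hit count. It assumes an elasticsearch-py 7.x client, where `indices.create` accepts `body=`. The names `INDEX_BODY`, `create_index` and `hit_count` are hypothetical, and the response fragment at the end is hand-written for illustration, not real output:

```python
# Same settings and mappings as the curl request above, as one dict:
# "settings" and "mappings" are sibling keys of a single JSON object.
INDEX_BODY = {
    "settings": {"analysis": {"analyzer": {
        "english_exact": {"tokenizer": "standard", "filter": ["lowercase"]}
    }}},
    "mappings": {"properties": {
        "DocumentName": {"type": "text"},
        "DocumentId": {"type": "keyword"},
        "Content": {"properties": {
            "PageNo": {"type": "integer"},
            "Text": {
                "type": "text",
                "analyzer": "english",
                "fields": {"exact": {"type": "text", "analyzer": "english_exact"}},
            },
        }},
    }},
}


def create_index(es, name="myindex"):
    """Create the index; `es` is assumed to be an elasticsearch-py 7.x client."""
    return es.indices.create(index=name, body=INDEX_BODY)


def hit_count(res):
    """Read the total from an Elasticsearch 7 search response.

    In 7.x, hits.total is an object such as {"value": 1, "relation": "eq"}.
    """
    return res["hits"]["total"]["value"]


# Hand-written response fragment, for illustration only (not real output).
sample = {"hits": {"total": {"value": 1, "relation": "eq"}, "hits": [{"_id": "1"}]}}
print(hit_count(sample))  # 1
```

With a live node, `create_index(es)` would replace the curl PUT, and `hit_count(res)` can be applied to the result of the `es.search(...)` call from the Python example.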