elasticsearch ranking

Prioritizing some keywords over others in Elasticsearch


If I have an Elasticsearch query for patent data like:

"query": {
            "bool": {
                "must": [
                    {
                        "bool": {
                            "should": [
                                {"match": {"claim": "SOME KEYWORD IN THE CLAIMS"}},
                                {"match": {"claim": "SOME LESS IMPORTANT KEYWORD IN THE CLAIMS"}}]
                        }
                    },
                    {
                        "range": {
                            "date_published": {
                                "gte": start_year,
                                "lte": end_year
                            }
                        }

                    }
                ],
                "should": [],
                "filter": [{
                    "exists": {
                        "field": "title"}}]
            }
        },
        "size": str(size),
        "from": str(offset),

        'language': 'EN',
        "stemming": "true",
    }

How do I tell the search that one keyword is more important than another? The field I'm searching is a text field, so I don't think I can use "terms" and "weights" (maybe I can; I'm an Elasticsearch noob).

Edit: the new query, based on the suggestion below, looks like:

{
    'query':
        {
            'bool':
                {
                    'must':
                        [{
                             'bool': {
                                 'should': []}},
                         {
                             'range': {
                                 'date_published': {
                                     'gte': '2019-01-01',
                                     'lte': None}}}],
                    'should': [{
                                   'match': {
                                       'title': 'AMBRA1',
                                       'boost': 2}}, {
                                   'match': {
                                       'abstract': 'AMBRA1',
                                       'boost': 2}}, {
                                   'match': {
                                       'claim': 'AMBRA1',
                                       'boost': 2}}, {
                                   'match': {
                                       'title': '',
                                       'boost': 2}}, {
                                   'match': {
                                       'abstract': '',
                                       'boost': 2}}, {
                                   'match': {
                                       'claim': '',
                                       'boost': 2}}],
                    'filter': [{
                                   'exists': {
                                       'field': 'title'}}]}},
    'size': '10',
    'from': '0',
    'language': 'EN',
    'stemming': 'true'}

and it returns a "cannot parse query" error.


Solution

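  • You can use "boost" in your "match" clauses, but it has to sit inside the field object next to "query" (for example "title": {"query": "AMBRA1", "boost": 2}), not beside the field name as in your edit. A "match" clause accepts only one field, so a sibling "boost" key is the likely cause of the parse error. A clause with a higher "boost" contributes more to the score than the others, so give the more important keyword a larger value than the less important one. The request below also leaves out the top-level "language" and "stemming" keys from your edit.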

     POST _search
       {
          "query": {
            "bool": {
              "must": [
                {
                  "bool": {
                    "should": []
                  }
                },
                {
                  "range": {
                    "date_published": {
                      "gte": "2019-01-01"
                    }
                  }
                }
              ],
              "should": [
                {
                  "match": {
                    "title": {
                      "query": "AMBRA1",
                      "boost": 2
                    }
                  }
                },
                {
                  "match": {
                    "abstract": {
                      "query": "AMBRA1",
                      "boost": 2
                    }
                  }
                },
                {
                  "match": {
                    "claim": {
                      "query": "AMBRA1",
                      "boost": 2
                    }
                  }
                },
                {
                  "match": {
                    "title": {
                      "query": "",
                      "boost": 2
                    }
                  }
                },
                {
                  "match": {
                    "abstract": {
                      "query": "",
                      "boost": 2
                    }
                  }
                },
                {
                  "match": {
                    "claim": {
                      "query": "",
                      "boost": 2
                    }
                  }
                }
              ],
              "filter": [
                {
                  "exists": {
                    "field": "title"
                  }
                }
              ]
            }
          },
          "size": "10",
          "from": "0"
       }
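
Since the question builds the request in Python, here is a minimal sketch of how the same body could be generated with a different boost per keyword. The helper names (boosted_match, build_query), the second keyword "autophagy" and the boost values 3 and 1 are assumptions for illustration only; they are not from the original query. Empty keywords are skipped instead of being sent as empty "match" clauses, and the empty inner "bool" from the request above is left out on the assumption that it is only a placeholder.

```python
import json


def boosted_match(field, keyword, boost):
    """Return a match clause with the boost nested under the field name."""
    return {"match": {field: {"query": keyword, "boost": boost}}}


def build_query(weighted_keywords, fields, start_date, size=10, offset=0):
    """Build a search body where each keyword carries its own boost.

    weighted_keywords maps keyword -> boost (hypothetical helper, not an
    Elasticsearch API). Empty keywords are skipped.
    """
    should = [
        boosted_match(field, keyword, boost)
        for keyword, boost in weighted_keywords.items()
        if keyword  # drop empty strings rather than emit empty match clauses
        for field in fields
    ]
    return {
        "query": {
            "bool": {
                "must": [{"range": {"date_published": {"gte": start_date}}}],
                "should": should,
                "filter": [{"exists": {"field": "title"}}],
            }
        },
        "size": size,
        "from": offset,
    }


# Example: "AMBRA1" counts three times as much as the hypothetical "autophagy".
body = build_query(
    {"AMBRA1": 3, "autophagy": 1},
    ["title", "abstract", "claim"],
    "2019-01-01",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST _search. Only the ratio between the boosts matters for the relative ordering, so it is worth trying a few values against real data.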