How to search in Elasticsearch by matching multiple fields with different priorities?


I have an Elasticsearch index with two fields: “title” and “product_titles”. I want to search for documents that match a query string in both fields, but give more priority to the “title” field. For example, if I search for “A B”, I want the documents that have “A B” in the “title” field to rank higher than the documents that have “A B” in the “product_titles” field. I also want to consider the number of occurrences of the query string in the “product_titles” field as a secondary factor for ranking.

Here is a sample of the expected results, sorted by score, in JSON format:

[
  {
    "title": "A B ...", // title matches
    "product_titles": ["anything", "anything"]
  },
  {
    "title": "anything",
    "product_titles": ["....A B....", " ... A B", "A B ..."] // three matches
  },
  {
    "title": "anything",
    "product_titles": ["....A B....", " ... A B"] // two matches
  },
  {
    "title": "A", // partial match
    "product_titles": ["anything"] // no match
  }
]

Solution

  • Here is one way to achieve this, though it requires re-indexing your data. Also test it thoroughly; it works with the data you provided.

    First, create the mapping for the title field and define a custom similarity, which controls how the score is computed for that field:

    PUT /dummy-index
    {
      "settings": {
        "number_of_shards": 1,
        "similarity": {
          "scripted_tf": {
            "type": "scripted",
            "script": {
              "source": "return Math.pow(doc.freq, query.boost)"
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "title": {
            "type": "text",
            "similarity": "scripted_tf"
          },
          "product_titles": {
            "type": "text"
          }
        }
      }
    }
    

    This request defines a scripted similarity for the title field. Instead of multiplying the score by the boost, the script raises the term frequency (doc.freq) to the power of the boost (query.boost).
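
    To make the difference between a power boost and a multiplicative boost concrete, here is a minimal Python sketch of the per-term arithmetic only. The function names are made up for illustration, and the sketch does not model how Elasticsearch combines term scores or scores across fields.

```python
def scripted_tf(freq: float, boost: float) -> float:
    """Same arithmetic as the Painless script: Math.pow(doc.freq, query.boost)."""
    return freq ** boost


def multiplicative(freq: float, boost: float) -> float:
    """Hypothetical comparison: the boost applied as a plain multiplier."""
    return freq * boost


if __name__ == "__main__":
    boost = 2.0  # matches the ^2 boost on the title field
    for freq in (1, 2, 3):
        # Columns: term frequency, power-boosted score, multiplied score
        print(freq, scripted_tf(freq, boost), multiplicative(freq, boost))
```

    With a boost of 2, a term frequency of 3 yields 9 under the power function and 6 under multiplication. Note also that a term occurring once scores 1 whatever the boost, so the boost only takes effect when the term repeats in the field.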

    Then, after indexing the data, the following query gives the desired results:

    GET dummy-index/_search
    {
      "query": {
        "multi_match": {
          "query": "A B",
          "fields": ["title^2", "product_titles"]
        }
      }
    }
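
    The indexing step itself is not shown above. As a rough sketch, the sample documents could be loaded through the _bulk API, whose body is newline-delimited JSON: one action line followed by one source line per document. The documents below are assumptions modeled on the expected-result sample in the question, and the helper name bulk_body is made up for illustration.

```python
import json

INDEX = "dummy-index"  # index name used in the mapping request

# Hypothetical test documents modeled on the question's expected-result sample.
docs = [
    {"title": "A B", "product_titles": ["anything", "anything"]},
    {"title": "anything", "product_titles": ["x A B x", "x A B", "A B x"]},
    {"title": "anything", "product_titles": ["x A B x", "x A B"]},
    {"title": "A", "product_titles": ["anything"]},
]


def bulk_body(index: str, documents: list) -> str:
    """Build an NDJSON payload for POST /_bulk (action line, then source line)."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(bulk_body(INDEX, docs), end="")
```

    The resulting text can be sent as the body of POST /_bulk with the Content-Type header set to application/x-ndjson, after which the search request above can be run against the index.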