Search code examples
elasticsearchelasticsearch-aggregationelasticsearch-6

Complex query in elasticsearch


I have a document structure which has prices with discounts. Discounts are applied only when all the conditions for that discount are met.
I tried hard to figure out a query that solves my use case mentioned below, but I guess that I will have to end up using the script feature. Is there a way we can frame a query for the below use case without scripting?

{
  "name": "Brown Teddy With Hat",
  "prices": {
    "basePrice": {
      "price": 3000, // actual price
      "discounts": [
        { "price": -10, "conditions": ["value1"] },
        { "price": -20, "conditions": ["value5", "value7"] }, // two conditions 
        { "price": -30, "conditions": ["value6"] },
        { "price": -40, "conditions": [] } // no conditions
      ]
    },
    "installmentPrice": {...} // ignore for now
  }
}

I want to sum all the discounts for which conditions are met. There will be an input array containing some condition values like [value5, value6, value1]. Discount conditions should be a subset of the input array.
In the above example: Discount -10, -30 and -40 will be applied. I want to subtract the discount from the actual price (3000) and then sort on the discounted price.

I tried the terms_set query but that works on simple cases like a document with an array field and matching the input against that array, but for my case discounts are multiple and nested and terms_set doesn't work correctly.

Elasticsearch version - 6.6.1
References that didn't help:
https://discuss.elastic.co/t/subsets/46488
https://discuss.elastic.co/t/match-all-terms-in-document-array/24793


Solution

  • Finally, I was able to use the terms_set after spending some time.
    I also added a field in the discount which is requiredMatch.
    For me, the value of the requiredMatch field is equal to the length of the conditions array. We can also avoid adding this field and use the script to access the length of the conditions array.
    The case with empty conditions array doesn't work. I will hack it by always adding a "default" value to the conditions array as well as the input array. So empty conditions will then always match as expected.

    Here is the query that solves my use case:

    {
      "aggs": {
        "sorted_by_price": {
          "aggs": {
            "nested_discounts_aggs": {
              "nested": {
                "path": "prices.basePrice.discounts"
              },
              "aggs": {
                "conditions_set_match": {
                  "filter": {
                    "terms_set": {
                      "prices.basePrice.discounts.conditions": {
                        "terms": [
                          "facebook",
                          "agreement12",
                          "bestxl",
                          "useless",
                          "magenta"
                        ],
                        "minimum_should_match_field": "prices.basePrice.discounts.requiredMatch"
                      }
                    }
                  },
                  "aggs": {
                    "finalPrice": {
                      "sum": {
                        "field": "prices.basePrice.discounts.price"
                      }
                    }
                  }
                }
              }
            }
          },
          "terms": {
            "field": "_id",
            "order": {
              "nested_discounts_aggs>conditions_set_match>finalPrice": "desc"
            },
            "size": 5
          }
        }
      }
    }
    

    The new doc looks like:

    {
      "name": "Brown Teddy With Hat",
      "prices": {
        "basePrice": {
          "discounts": [
            { "price": 3000, "conditions": ["default"], "requiredMatch": 1 }, // actual price
            { "price": -10, "conditions": ["value1"], "requiredMatch": 1 },
            { "price": -20, "conditions": ["value5", "value7"], "requiredMatch": 2 }, // two conditions 
            { "price": -30, "conditions": ["value6"], "requiredMatch": 1 },
            { "price": -40, "conditions": ["default"], "requiredMatch": 1 } // no conditions
          ]
        },
        "installmentPrice": {...} // ignore for now
      }
    }
    

    The only part that remains is adding price to the discounts. I will accomplish that by removing the price field and including actual price as a discount with a positive value. Please note discounts have negative values.