Elasticsearch - Filter => Bool => Should(delegate array) - want automation, not hardcoded, how?


Elasticsearch 8.1.2 (Elastic.Clients.Elasticsearch) in C#

Filter => Bool => Should can take a comma-separated list of lambda expressions, as is evident in the framework. The call to Should in my current code is about two-thirds of the way down the full code below. The argument used to get the Field and Query is a List<(Field, string)> named p_should_contain, which can have a dynamic size.

If I iterate over the list and set up a Should condition on each pass, Should gets overwritten by the current iteration, and only the last iteration's Should value is used, as shown by the DebugInfo.

Should has an overload that takes Action<QueryDescriptor<T>>[].

However, when I create an array of delegates, runtime results in an error where the delegate values passed to Should are null. Here is how I am attempting to do that:

    Action<QueryDescriptor<T>>[] z = new Action<QueryDescriptor<T>>[p_should_contain.Count];
    boolQuery.Filter(f =>
      f.Bool(b => b.Should(s =>
      {
        for (int i = 0; i < p_should_contain.Count; ++i)
        {
              Action<QueryDescriptor<T>> u = s => s.Match(m => m.Field(p_should_contain[i].Item1).Query(p_should_contain[i].Item2));
              z[i] = u; 
        }
      })));
    
    boolQuery.Filter(f => f.Bool(b => b.Should(z)));

Here is the complete function, with the hardcoded, comma-separated Should setup about two-thirds of the way down. How can I set this up dynamically? Apologies for the formatting.

    public async Task<List<T>?> SearchAsync<T>(
         string p_index_to_search,
         string p_search_term,
         string? p_collapse_property_name,
         Field[] p_should_fields,
         Dictionary<Field, string>? p_must_contain = null,
         List<(Field, string)>? p_should_contain = null
         ) where T : class
    {
        // check the index passed in to determine what index to use

        SearchResponse<T> response = null;

        response = await _client.SearchAsync<T>(search =>
        {
            search.Index(p_index_to_search)
                   .Query(q => q
                       .Bool(boolQuery =>
                       {
                           if (p_must_contain is not null)
                           {
                               foreach (var field in p_must_contain)
                               {
                                   //Add all fields that must match
                                   //if must fields is null nothing will be added
                                   boolQuery.Must(mustQuery => mustQuery
                                               .Match(match => match
                                                   .Field(field.Key)
                                                   .Query(field.Value)
                                               )
                                           );
                               }
                           }
                           
                           
                           // the search match
                           boolQuery.Should(shouldQuery => shouldQuery
                                        .MultiMatch(multiMatch => multiMatch
                                            .Query(p_search_term)
                                            .Fields(p_should_fields)
                                            .Type(Elastic.Clients.Elasticsearch.QueryDsl.TextQueryType.MostFields)
                                            .MinimumShouldMatch(1)
                                        )
                                    );


                           boolQuery.Filter(f =>
                                        f.Bool(b => b.Should(
                                            s => s.Match(m => m.Field(p_should_contain[0].Item1).Query(p_should_contain[0].Item2)),
                                            s => s.Match(m => m.Field(p_should_contain[1].Item1).Query(p_should_contain[1].Item2))
                                            )));
                       }))
            .Size(20)
            .Explain();


            //Only perform a collapse on the search object if a collapse property name is passed in
            if (!string.IsNullOrWhiteSpace(p_collapse_property_name))
            {
                search.Collapse(o => o.Field(new Field(p_collapse_property_name)));
            }
        });

        if (response.IsValidResponse)
        {
            return response.Documents.ToList();
        }

        return null;
    }

I have tried what I have mentioned.


Solution

  • Using what you'd already built, I was able to make this work with a slight modification.

    Building the Action<QueryDescriptor<T>>[] array:

        Action<QueryDescriptor<T>>[] actions = new Action<QueryDescriptor<T>>[p_should_contain.Count];
    
        for (int i = 0; i < p_should_contain.Count; i++)
        {
            var (fieldName, queryText) = p_should_contain[i];
            actions[i] = q => q.Match(m => m.Field(fieldName).Query(queryText));
        }
    

    Using the array in a boolQuery Filter:

        if (p_should_contain is not null)
        {
           boolQuery.Filter(filter =>
           {
              filter.Bool(b => b
                       .Should(actions)
                       .MinimumShouldMatch(1)
              );
           });
         }
    

    Produces this query:

        "query": {
            "bool": {
                "filter": {
                    "bool": {
                        "should": [
                            {
                                "match": {
                                    "fieldNameToSearch": {
                                        "query": "some value"
                                    }
                                }
                            },
                            {
                                "match": {
                                    "fieldNameToSearch": {
                                        "query": "some other value"
                                    }
                                }
                            }
                        ],
                        "minimum_should_match": 1
                    }
                },
    

    Explaining the issue: A lambda expression captures the variable itself, not the variable's value at the time the lambda is created. Each lambda therefore reads whatever value i holds when the lambda finally runs.

    With two items in the list, the final loop increment sets i to 2 and the loop ends. The lambdas only run after that, so instead of the indexes 0 and 1 that you expect, both use 2, which is out of bounds for p_should_contain.

    My implementation avoids this by declaring the local variables var (fieldName, queryText) inside the loop body, so each lambda captures its own copies.
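
The capture behaviour described above can be reproduced without Elasticsearch at all. The following is a minimal sketch using plain Func<int> delegates; the class and method names (ClosureCaptureDemo, CaptureLoopVariable, CaptureLocalCopy) are made up for illustration and are not part of the Elastic client.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureDemo
{
    // Captures the for-loop variable itself: every delegate shares the same `i`.
    public static List<Func<int>> CaptureLoopVariable(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            funcs.Add(() => i); // `i` is read when the delegate runs, not when it is created
        }
        return funcs;
    }

    // Copies the value into a local declared inside the loop body before capturing it.
    public static List<Func<int>> CaptureLocalCopy(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i; // a fresh variable on every iteration
            funcs.Add(() => copy);
        }
        return funcs;
    }

    public static void Main()
    {
        // Shared loop variable: both delegates see the final value of i.
        foreach (var f in CaptureLoopVariable(2)) Console.Write(f() + " "); // prints "2 2 "
        Console.WriteLine();

        // Per-iteration local: each delegate sees its own value.
        foreach (var f in CaptureLocalCopy(2)) Console.Write(f() + " ");    // prints "0 1 "
        Console.WriteLine();
    }
}
```

With two elements, the first method yields 2 and 2, which is the out-of-bounds index seen in the question, while the second yields 0 and 1. The tuple deconstruction var (fieldName, queryText) in the answer plays the same role as copy here.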