
Elasticsearch 2.0: LIKE-style (partial word) search query


This is my model:

[ElasticsearchType(Name = "projectmodel")]
public class ProjectModel
{          
    public string name { get; set; }
    public string description { get; set; }
    [Nested]
    [JsonProperty("taskmodels")]
    public List<TaskModel> taskmodels { get; set; }
}

public class TaskModel
{       
    public string title { get; set; }
    public string description { get; set; }
}

I use the following code to search inside the main object and the nested objects.

        var searchResults = client.Search<ProjectModel>(
            body => body
                .Query(query => query.Bool(
                    bq => bq.Should(
                        q => q.Match(p => p.Field(f => f.name).Boost(6).Query(keyword)),
                        q => q.Match(p => p.Field(f => f.description).Boost(6).Query(keyword)),
                        q => q.Nested(n => n.Path(p => p.taskmodels).Query(
                            nq => nq.Match(m => m.Query(keyword).Field("taskmodels.description")))),
                        q => q.Nested(n => n.Path(p => p.taskmodels).Query(
                            nq => nq.Match(m => m.Query(keyword).Field("taskmodels.title"))))
                    )
                ))
                .Size(MAX_RESULT)
        );

This searches the object without any issue, but I have to enter the exact word as the search text to get a result.

For example, take a record whose name is "Science and technology".

If I search for technology, the record is returned. But if I search for techno, it is not. How can I fix this?


Solution

  • You can add a match_phrase_prefix query on your field.

    match_phrase_prefix takes the last token in your search query and does a prefix match on it; the tokens before it are matched as a phrase, so the order of tokens matters. If you want to match partial words anywhere in the text, you will need to index n-grams or edge n-grams of the tokens.

    var searchResults = _elasticClient.Search<ProjectModel>(
        body => body
            .Query(query => query.Bool(
                bq => bq.Should(
                    // Note: added prefix match on name, so "techno" can find "technology".
                    q => q.MatchPhrasePrefix(p => p.Field(f => f.name).Query(keyword)),
                    q => q.Match(p => p.Field(f => f.name).Boost(6).Query(keyword)),
                    // Prefix match on description instead of a plain match.
                    q => q.MatchPhrasePrefix(p => p.Field(f => f.description).Boost(6).Query(keyword)),
                    q => q.Nested(n => n.Path(p => p.taskmodels).Query(
                        nq => nq.Match(m => m.Query(keyword).Field("taskmodels.description")))),
                    q => q.Nested(n => n.Path(p => p.taskmodels).Query(
                        nq => nq.Match(m => m.Query(keyword).Field("taskmodels.title"))))
                )
            ))
            .Size(MAX_RESULT)
    );
    

    Difference between match, match_phrase and match_phrase_prefix

    Suppose there is a document with the field "description" : "science and technology". This text is broken into separate tokens ["science", "and", "technology"], which are stored in the inverted index.

    If you want to look for "science technology", you can use a match query. For match, the order of words doesn't matter, so you will also get the document when you search for "technology science". It just matches tokens.

    If order matters, use match_phrase: only "science and technology" will return the document.
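
    To make the difference between match and match_phrase concrete, here is a minimal sketch that imitates both on a token list. It does not call Elasticsearch or NEST; the class TokenMatchSketch, its methods, and the whitespace tokenizer are hypothetical stand-ins for what the analyzer and the inverted index do, and Match assumes the default OR operator.

```csharp
using System;
using System.Linq;

public static class TokenMatchSketch
{
    // Simplified stand-in for an analyzer (assumption): lowercase, split on spaces.
    public static string[] Tokenize(string text) =>
        text.ToLowerInvariant().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Imitates match with the OR operator: true if any query token
    // equals a document token, regardless of order.
    public static bool Match(string doc, string query)
    {
        var docTokens = Tokenize(doc);
        return Tokenize(query).Any(q => docTokens.Contains(q));
    }

    // Imitates match_phrase: true only if the query tokens appear
    // consecutively and in the same order in the document.
    public static bool MatchPhrase(string doc, string query)
    {
        var d = Tokenize(doc);
        var q = Tokenize(query);
        if (q.Length == 0) return false;
        for (int start = 0; start + q.Length <= d.Length; start++)
        {
            if (d.Skip(start).Take(q.Length).SequenceEqual(q)) return true;
        }
        return false;
    }

    public static void Main()
    {
        const string doc = "Science and technology";
        Console.WriteLine(Match(doc, "technology science"));           // True
        Console.WriteLine(Match(doc, "techno"));                       // False
        Console.WriteLine(MatchPhrase(doc, "science and technology")); // True
        Console.WriteLine(MatchPhrase(doc, "technology science"));     // False
    }
}
```

    The second call shows the problem from the question: whole tokens are compared, so "techno" never equals "technology".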

    If you want to search for a partial sentence such as "science and techno", use match_phrase_prefix. The prefix match is only performed on the last token, so you cannot search for "scie and techno". For that there are other options such as edge n-grams and n-grams.
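
    The last-token rule and the edge n-gram alternative can be sketched the same way. This is an illustration only, not Elasticsearch or NEST code: PrefixSketch and its methods are hypothetical, the gram sizes (2 to 10) are arbitrary assumptions, and MatchEdgeNGrams assumes that every query token must match (AND operator) and that the query itself is not n-grammed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixSketch
{
    // Simplified stand-in for an analyzer (assumption): lowercase, split on spaces.
    public static string[] Tokenize(string text) =>
        text.ToLowerInvariant().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Imitates match_phrase_prefix: all query tokens except the last must match
    // exactly and in order; the last one only has to be a prefix.
    public static bool MatchPhrasePrefix(string doc, string query)
    {
        var d = Tokenize(doc);
        var q = Tokenize(query);
        if (q.Length == 0) return false;
        int head = q.Length - 1;
        for (int start = 0; start + q.Length <= d.Length; start++)
        {
            bool headMatches = d.Skip(start).Take(head).SequenceEqual(q.Take(head));
            bool lastIsPrefix = d[start + head].StartsWith(q[head], StringComparison.Ordinal);
            if (headMatches && lastIsPrefix) return true;
        }
        return false;
    }

    // Edge n-grams of one token: its prefixes of length minGram..maxGram.
    public static IEnumerable<string> EdgeNGrams(string token, int minGram, int maxGram)
    {
        for (int len = minGram; len <= Math.Min(maxGram, token.Length); len++)
            yield return token.Substring(0, len);
    }

    // Imitates indexing edge n-grams: each query token must equal one indexed prefix.
    public static bool MatchEdgeNGrams(string doc, string query, int minGram = 2, int maxGram = 10)
    {
        var indexed = new HashSet<string>(
            Tokenize(doc).SelectMany(t => EdgeNGrams(t, minGram, maxGram)));
        var q = Tokenize(query);
        return q.Length > 0 && q.All(t => indexed.Contains(t));
    }

    public static void Main()
    {
        const string doc = "Science and technology";
        Console.WriteLine(MatchPhrasePrefix(doc, "techno"));             // True
        Console.WriteLine(MatchPhrasePrefix(doc, "science and techno")); // True
        Console.WriteLine(MatchPhrasePrefix(doc, "scie and techno"));    // False
        Console.WriteLine(MatchEdgeNGrams(doc, "scie techno"));          // True
    }
}
```

    With edge n-grams the prefixes are produced at index time, so every word in the query can be partial, at the cost of a larger index.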