How to get Azure Search results that contain at least one complete word?


I'm testing the following query in the Azure Search portal, but it is not giving me the expected results. I want the result to include any document that has at least one occurrence of the word "algo".

search=algo&queryType=full&searchMode=any

Important: MyVal is searchable and uses the Lucene analyzer for Spanish.

Expected element result:

{
    "@odata.context": "https://....windows.net/indexes(....)/$metadata#docs(*)",
    "value": [
        {
            "MyKey":"1",
            "MyValues":[
                {
                    "MyVal":"algo aqui"
                },
                {
                    "MyVal":"lala"
                }
            ]
        }
    ]
}

NOT expected element result:

{
    "@odata.context": "https://....windows.net/indexes(....)/$metadata#docs(*)",
    "value": [
        {
            "MyKey":"1",
            "MyValues":[
                {
                    "MyVal":"algoOtherStuff aqui"
                },
                {
                    "MyVal":"lala"
                }
            ]
        }
    ]
}

Actual result:

{
    "@odata.context": "https://....windows.net/indexes(....)/$metadata#docs(*)",
    "value": []
}

More example queries and results

search=algo*&queryType=full&searchMode=any

[No results]


search=/.*algo.*/&queryType=full&searchMode=any

[No results]


search=algo aqui&queryType=full&searchMode=any

[EXPECTED RESULT!!!] (element found)


search=aqui&queryType=full&searchMode=any

[EXPECTED RESULT!!!] (element found)


IMPORTANT: If I replace the words with other test values such as "some data" or "something special" and search for one of those words, Azure Search returns the expected results. It seems to be a problem with the particular word "algo".


Solution

  • Ok, I was able to reproduce the problem with the following code:

    var client = new SearchServiceClient("xxxx", new SearchCredentials("abcabc"));

    // Index with a complex collection whose searchable sub-field uses the Spanish Lucene analyzer.
    client.Indexes.Create(new Microsoft.Azure.Search.Models.Index
    {
        Name = "index",
        Fields = new List<Field>
        {
            new Field("Id", DataType.String) { IsKey = true, IsRetrievable = true, IsFilterable = true },
            Field.NewComplex("MyValues", true, new List<Field>
            {
                new Field("MyVal", DataType.String)
                {
                    IsRetrievable = true,
                    IsFilterable = true,
                    IsSearchable = true,
                    Analyzer = AnalyzerName.EsLucene
                }
            })
        }
    });

    // CustomDoc and MyValues are the model classes for the documents (definitions not shown).
    var docs = new List<CustomDoc>
    {
        new CustomDoc { Id = "1", MyValues = new MyValues[] { new MyValues { MyVal = "algo aqui" }, new MyValues { MyVal = "lala" } } },
        new CustomDoc { Id = "2", MyValues = new MyValues[] { new MyValues { MyVal = "something else" }, new MyValues { MyVal = "xxx" } } },
    };

    // Upload both documents to the index.
    var indexClient = client.Indexes.GetClient("index");
    indexClient.Documents.Index(IndexBatch.Upload(docs));
    

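    And yes, you were right. "algo" is a stop word in the Spanish Lucene analyzer (EsLucene), so it is removed both when documents are indexed and when the query is analyzed. A query containing only "algo" therefore has no terms left to match, while "algo aqui" still matches on "aqui". The stop word list is here: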

    https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/spanish_stop.txt

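    Switching the field to the EsMicrosoft analyzer makes searches for "algo" return results. The analyzer of an existing field cannot be changed, so the index has to be recreated and the documents uploaded again: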

    client.Indexes.Create(new Microsoft.Azure.Search.Models.Index
    {
        Name = "index",
        Fields = new List<Field>
        {
            new Field("Id", DataType.String) { IsKey = true, IsRetrievable = true, IsFilterable = true },
            Field.NewComplex("MyValues", true, new List<Field>
            {
                new Field("MyVal", DataType.String)
                {
                    IsRetrievable = true,
                    IsFilterable = true,
                    IsSearchable = true,
                    Analyzer = AnalyzerName.EsMicrosoft
                }
            })
        }
    });
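
To make the stop word effect concrete, here is a rough, self-contained sketch that imitates what happens to the query. It is not the real Lucene analyzer: the `StopWordDemo` class, its `Analyze` and `Matches` helpers, and the tiny stop word subset are all assumptions made up for illustration, and the real analyzer also does stemming and other normalization that is skipped here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only: a toy stand-in for a stop-word-removing analyzer.
public static class StopWordDemo
{
    // A small hand-picked subset of the Spanish stop word list (assumption: the real list is much longer).
    private static readonly HashSet<string> StopWords =
        new HashSet<string> { "algo", "de", "la", "que", "el", "en", "y" };

    // Lowercase the text, split it on spaces, and drop every stop word.
    public static List<string> Analyze(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Where(token => !StopWords.Contains(token))
                   .ToList();
    }

    // Imitates searchMode=any: a hit if any surviving query term is among the indexed terms.
    public static bool Matches(string query, string fieldValue)
    {
        var indexedTerms = new HashSet<string>(Analyze(fieldValue));
        return Analyze(query).Any(term => indexedTerms.Contains(term));
    }

    public static void Main()
    {
        Console.WriteLine(Analyze("algo").Count);             // 0: nothing is left to search for
        Console.WriteLine(Matches("algo", "algo aqui"));      // False
        Console.WriteLine(Matches("algo aqui", "algo aqui")); // True: "aqui" survives and matches
        Console.WriteLine(Matches("aqui", "algo aqui"));      // True
    }
}
```

This mirrors the behavior in the question: "algo" alone finds nothing, while "algo aqui" and "aqui" both find the document. It also suggests why `algo*` came back empty, since the term "algo" was never stored in the index in the first place.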