azure-cognitive-search azure-search-.net-sdk

Azure Search - querying for ampersand character


I'm trying to find all records that contain the & symbol, which is a reserved character. I'm using the search parameter, not $filter.

According to the documentation, it cannot be escaped with \% and should instead be URL-encoded as %26.

I tried both the SDK and Search explorer with the following queries, but none of them returned results:

  1. &
  2. *&*
  3. *%26*
  4. %26
  5. \%26

UPD

Document example:

{
    "test": "Hello & World"
}

Search query: search=%26&searchFields=test&$select=test

UPD 2

using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public class MyType
{
    [System.ComponentModel.DataAnnotations.Key]
    [IsFilterable]
    public string Id { get; set; }

    [IsSearchable, IsSortable]
    public string Test { get; set; }
}

class Program
{
    private static SearchServiceClient CreateSearchServiceClient()
    {
        string searchServiceName = "XXXXXX";
        string adminApiKey = "XXXXXXX";

        var serviceClient = new SearchServiceClient(searchServiceName, new SearchCredentials(adminApiKey));
        return serviceClient;
    }

    static void Main(string[] args)
    {
        var client = CreateSearchServiceClient();

        // Create the index; fields are derived from the attributes on MyType.
        var def = new Microsoft.Azure.Search.Models.Index
        {
            Name = "temp-test-reserved1",
            Fields = FieldBuilder.BuildForType<MyType>()
        };
        client.Indexes.Create(def);
        var c = client.Indexes.GetClient("temp-test-reserved1");

        // Upload three documents with & in the middle, at the start, and at the end.
        var actions = new IndexAction<MyType>[]
        {
            IndexAction.Upload(new MyType{ Id = "1", Test = "Hello & World" }),
            IndexAction.Upload(new MyType{ Id = "2", Test = "& test start" }),
            IndexAction.Upload(new MyType{ Id = "3", Test = "test end &" })
        };
        c.Documents.Index(IndexBatch.New(actions));
    }
}

Search query: search=%26&searchFields=Test&$select=Test


Solution

  • You likely can't find & because the default analyzer treats it as punctuation and removes it at both indexing and query time. The Azure Cognitive Search documentation on lexical analysis has more details. If you need to preserve ampersands while still tokenizing text on whitespace boundaries, you'll need to create a custom analyzer.
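
As a rough illustration of the custom analyzer idea, here is a minimal sketch that reuses the Microsoft.Azure.Search SDK types from the question. The analyzer name "whitespace-lowercase", the index name "temp-test-reserved2", and the class names MyTypeWithAmpersand and IndexSetup are all made up for this example, and combining the whitespace tokenizer with a lowercase filter is just one possible choice. It assumes a new index, since the analyzer of an existing field can't be changed in place.

```csharp
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public class MyTypeWithAmpersand
{
    [System.ComponentModel.DataAnnotations.Key]
    [IsFilterable]
    public string Id { get; set; }

    // Hypothetical analyzer name; it has to match the CustomAnalyzer defined below.
    [IsSearchable, IsSortable, Analyzer("whitespace-lowercase")]
    public string Test { get; set; }
}

static class IndexSetup
{
    // Creates a new index whose Test field is tokenized on whitespace only,
    // so a standalone "&" is kept as a token instead of being dropped as punctuation.
    public static void CreateIndex(SearchServiceClient client)
    {
        var def = new Microsoft.Azure.Search.Models.Index
        {
            Name = "temp-test-reserved2", // hypothetical index name
            Fields = FieldBuilder.BuildForType<MyTypeWithAmpersand>(),
            Analyzers = new Analyzer[]
            {
                new CustomAnalyzer
                {
                    Name = "whitespace-lowercase",
                    Tokenizer = TokenizerName.Whitespace,
                    // Lowercasing keeps matching case-insensitive, as with the default analyzer.
                    TokenFilters = new[] { TokenFilterName.Lowercase }
                }
            }
        };
        client.Indexes.Create(def);
    }
}
```

With an analyzer along these lines, a query such as search=%26&searchFields=Test should be able to match documents like "Hello & World", because the & survives as its own token. An ampersand with no surrounding whitespace (for example "a&b") would stay part of a larger token and would not match a search for & alone.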