Search code examples
elasticsearchazure-cognitive-search

How to search for # in Azure Search


Hi I have a string field which has an nGram analyzer.

And our query goes like this.

$count=true&queryType=full&searchFields=name&searchMode=any&$skip=0&$top=50&search=/(.*)Site#12(.*)/

The test we are searching for has Site#123

The above query will work with all other alpha numeric charecters except #. Any idea how could I make this work.


Solution

  • If you are using the standard tokenizer, the ‘#’ character was removed from indexed documents as it’s considered a separator. For indexing, you can either use a different tokenizer, such as the whitespace tokenizer, or replace the ‘#’ character with another character such as ‘_’ with the mapping character filter (underscore ‘_’ is not considered a separator). You can test the analyzer behavior using the Analyze API : https://learn.microsoft.com/rest/api/searchservice/test-analyzer.

    It’s important to know that the query terms of regex queries are not analyzed. This means that the ‘#’ character won’t be removed by the analyzer from the regex expression. You can learn more about query processing in Azure Search here: How full text search works in Azure Search