
Azure Search using Lucene Query Syntax Returns Incorrect Results


I am using the Microsoft.Azure.Search .NET SDK v5.0.1. I am attempting to perform a search against my Azure Search index as follows: Documents.SearchAsync("fieldname:val* AND timeStamp:2018-05-03T13\:23\:59Z"). The results are incorrect. There are exactly 2 documents in my index with that timestamp. There are 121 documents in my index where the fieldname starts with val. When I run the above query using the SDK, it always returns 121 documents. Is there some special way to query timestamp that I am missing?


Solution

  • There are a few points to make here:

    1. In your index definition, I believe timeStamp is defined as a String; otherwise the query would not have worked at all, since DateTime fields are not searchable. First, I'd advise against treating timeStamp as a string, because searchable fields go through analysis, with tokenization being one step (see the reference on query parsing). In your case, the timestamp term (say 2018-05-03) is tokenized into smaller constituents (2018, 05, 03), and documents containing any of those terms are returned. That is why the query matches far more documents than the 2 you expect.

    2. Your scenario is a classic case of filtering results on a criterion and then searching within the filtered documents. To accomplish this, do the following:

      • Use a filter on the timestamp, so that it doesn't go through the analysis
      • On the filtered results, apply your search query.
      • Reference
    3. If possible, I strongly recommend making timeStamp a datetime field (Edm.DateTimeOffset in Azure Search) rather than a string, so that comparisons follow date semantics instead of string matching.

    As an example, here's how you'd go about achieving a filter + search combo:

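    // Assumes timeStamp is a filterable string field that stores this exact value.
    var parameters = new SearchParameters()
    {
        // Filters are matched as-is and bypass analysis.
        Filter = "timeStamp eq '2018-05-03T13:23:59Z'",
        // The fielded prefix query below uses full Lucene syntax.
        QueryType = QueryType.Full
    };
    var results = await Documents.SearchAsync("fieldname:val*", parameters);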
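
If you do switch timeStamp to Edm.DateTimeOffset as suggested in point 3, the filter literal is written without quotes. Here is a minimal sketch of building that filter string. The TimestampFilter class and its Eq method are hypothetical helpers, not part of the SDK, and the sketch assumes the field is marked filterable and that values are compared in UTC.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of Microsoft.Azure.Search.
public static class TimestampFilter
{
    // Builds an OData equality filter for a DateTimeOffset field.
    // The value is converted to UTC and emitted as an unquoted literal,
    // e.g. "timeStamp eq 2018-05-03T13:23:59Z".
    public static string Eq(string field, DateTimeOffset value)
    {
        string literal = value.UtcDateTime.ToString(
            "yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
        return field + " eq " + literal;
    }
}
```

You would then assign the result to the Filter property, for example Filter = TimestampFilter.Eq("timeStamp", new DateTimeOffset(2018, 5, 3, 13, 23, 59, TimeSpan.Zero)), and keep the rest of the search call unchanged.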