Return five following documents from an id with Elasticsearch and NEST


I think I have gone blind from staring at the same error over and over, and could really use some input. I have a time-series set of documents and want to find the five documents that follow a specific id. I start by fetching that single document, then fetch the next five documents, excluding that id:

var documents = client.Search<Document>(s => s
    .Query(q => q
        .ConstantScore(cs => cs
            .Filter(f => f
                .Bool(b => b
                    // Only documents at or after the start document's timestamp
                    .Must(must => must
                        .DateRange(dr => dr.Field(field => field.Time).GreaterThanOrEquals(startDoc.Time)))
                    // Exclude the start document itself
                    .MustNot(mustNot => mustNot
                        .Term(term => term.Id, startDoc.Id))))))
    .Take(5)
    .Sort(sort => sort.Ascending(asc => asc.Time))).Documents;

My problem is that while five documents are returned and sorted correctly, the start document is among them. I'm trying to filter it out with the must_not filter, but that doesn't seem to be working. I'm pretty sure I have done this elsewhere, so it might be a small issue that I simply cannot see :)

Here's the query generated by NEST:

{
   "query":{
      "constant_score":{
         "filter":{
            "bool":{
               "must":[
                  {
                     "range":{
                        "time":{
                           "gte":"2020-08-31T10:47:12.2472849Z"
                        }
                     }
                  }
               ],
               "must_not":[
                  {
                     "term":{
                        "id":{
                           "value":"982DBC1BE9A24F0E"
                        }
                     }
                  }
               ]
            }
         }
      }
   },
   "size":5,
   "sort":[
      {
         "time":{
            "order":"asc"
         }
      }
   ]
}

Solution

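  • This could be happening because the id field is an analyzed field. Analyzed fields are tokenized at index time (the standard analyzer, for example, also lowercases the text), while a term query does not analyze the value it is given, so the exact id you pass may not match any indexed term and the must_not clause excludes nothing. Using the non-analyzed version of the field for exact matches (as you mentioned in the comments, you have one) within your filter will fix the difference you are seeing.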

    More about analyzed vs non-analyzed fields here
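
As a rough sketch of what the corrected filter could look like, here is the bool part of the generated query with the term clause pointed at a non-analyzed sub-field. The name id.keyword is an assumption (it is the sub-field that default dynamic mapping usually creates for strings); substitute whatever your non-analyzed field is actually called in your mapping.

```json
{
  "bool": {
    "must": [
      { "range": { "time": { "gte": "2020-08-31T10:47:12.2472849Z" } } }
    ],
    "must_not": [
      { "term": { "id.keyword": { "value": "982DBC1BE9A24F0E" } } }
    ]
  }
}
```

On the NEST side, one way to target such a sub-field is to append a suffix to the field expression, e.g. term.Field(f => f.Id.Suffix("keyword")).Value(startDoc.Id), again assuming the sub-field is named keyword.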