I am using this hopelessly inefficient code to establish if a document is already indexed:
foreach (var entry in dic)
{
    var response = client.Search<Document>(s => s.Query(q => q.QueryString(d =>
        d.Query(string.Format("{0}", entry.Key)))));
    if (response.Documents.Count == 0)
    {
        not_found++;
    }
    else
    {
        found++;
    }
}
Is it possible to send several entry.Key values in one batch rather than hitting the endpoint once for every ID (entry.Key)? Thanks.
Sure!
You can use a terms query (called a terms filter in older Elasticsearch versions):
client.Search<Document>(s => s.Query(
    q => q.Terms(
        c => c
            .Field(doc => doc.Id)
            .Terms(keys))));
If you are specifically looking for IDs, you can use the ids query:
client.Search<Document>(s => s.Query(
    q => q.Ids(c => c.Values(keys))
));
If you are only interested in whether or not the document(s) have been indexed, consider limiting the returned fields to only the ID field so you don't waste bandwidth returning the full document:
response = client.Search<Document>(s => s
    .Query(q => q.Ids(c => c.Values(keys))) // look for these IDs
    .StoredFields(sf => sf.Fields(doc => doc.Id)) // return only the Id field
);
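If you also need to know which keys are missing, one option is to compare the keys you sent with the IDs that came back. The helper below is only a sketch: the class and method names are made up for illustration, the keys are assumed to be strings, and returnedIds is assumed to come from the hit metadata (something like response.Hits.Select(h => h.Id)). Keep in mind that a search returns only 10 hits by default, so the request would also need a Size large enough to cover the whole batch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexCheck
{
    // Hypothetical helper: returns the keys that do not appear among the IDs
    // returned by the search, preserving the order of the input keys.
    public static List<string> MissingKeys(
        IEnumerable<string> keys,
        IEnumerable<string> returnedIds)
    {
        // A HashSet gives O(1) lookups, so the whole comparison is linear.
        var present = new HashSet<string>(returnedIds);
        return keys.Where(k => !present.Contains(k)).Distinct().ToList();
    }
}
```

Everything not in the returned list counts as not found; everything else counts as found.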
Lastly, if you're only interested in the number of matching documents, then you can ask Elasticsearch to not return any results, and only use the response metadata to count how many documents matched:
response = client.Search<Document>(s => s
.Query(q => q.Ids(c => c.Values(keys))) // look for these IDs
.Size(0) // return 0 hits
);
found += (int)response.Total; // number of total hits; Total is a long, so the cast assumes found is an int
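For a very large dictionary it may be worth sending the keys in several moderately sized batches rather than one huge request. Here is a rough sketch of that loop. It is not tied to NEST: countExisting is a hypothetical callback that takes one batch of IDs and returns how many of them are indexed (for example by running the ids query with Size(0) and reading Total), and the class name, method name, and batch size are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchedExistence
{
    // Splits the keys into batches, asks countExisting (a hypothetical callback
    // wrapping the search request) how many IDs of each batch are indexed, and
    // derives the not-found count from the number of distinct keys.
    public static (long Found, long NotFound) Count(
        IReadOnlyCollection<string> keys,
        int batchSize,
        Func<IReadOnlyList<string>, long> countExisting)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Duplicates would make Total smaller than the number of keys sent.
        var distinct = keys.Distinct().ToList();
        long found = 0;
        for (int i = 0; i < distinct.Count; i += batchSize)
        {
            var batch = distinct.GetRange(i, Math.Min(batchSize, distinct.Count - i));
            found += countExisting(batch);
        }
        return (found, distinct.Count - found);
    }
}
```

With the code from the question, a call could look roughly like BatchedExistence.Count(dic.Keys.ToList(), 1000, batch => client.Search<Document>(s => s.Query(q => q.Ids(c => c.Values(batch))).Size(0)).Total), assuming the dictionary keys are strings.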