Search code examples
c#performancemongodbmongodb-.net-drivermongodb-query

MongoDB Most Efficient Way to Perform this Query


I have a requirement, that when given a list of objects, I need to exclude ones which already exist within my database.

I have gone down the traditional route, of iterating over the objects, and one at a time checking if the object exists in my mongo collection.

foreach (PickerPlace pickerPlace in param)
{
    string id = pickerPlace.id;
    IMongoQuery query = Query<Place>.Where(p => p.Id == id);
    int count = this.context.Place.AsQueryable().Count(q => query.Inject());
    if (count == 0)
    {
        filteredResults.Add(pickerPlace);
    }
}

return filteredResults;

Is this the most efficient way of doing what I am trying to achieve, somehow it feels to me like I should be performing some kind of batch operation.

Many Thanks

Update:

I have found the following code to be far far more efficient, but I would still like suggestions on how to improve further.

List<string> ids = param.Select(p => p.id).ToList();
var results = this.context.Place.Find(Query.In("Id", new BsonArray(ids))).ToList();

Solution

  • The most efficient option is this:

    var newIds = new HashSet<string>(param.Select(p => p.Id));
    newIds.ExceptWith(
        places.Find(Query<Place>.In(p => p.Id, newIds))
            .SetFields(Fields<Place>.Include(p => p.Id))
            .Select(p => p.Id));
    

    HashSet enables efficient comparisons using the item's GetHashCode (and Equals). The query returns all the existing items in a single query. The SetFields returns only the ids so the built in _id index (which is probably on RAM) is used and there's no need to even use the actual data files.