I have a requirement, that when given a list of objects, I need to exclude ones which already exist within my database.
I have gone down the traditional route, of iterating over the objects, and one at a time checking if the object exists in my mongo collection.
foreach (PickerPlace pickerPlace in param)
{
string id = pickerPlace.id;
IMongoQuery query = Query<Place>.Where(p => p.Id == id);
int count = this.context.Place.AsQueryable().Count(q => query.Inject());
if (count == 0)
{
filteredResults.Add(pickerPlace);
}
}
return filteredResults;
Is this the most efficient way of doing what I am trying to achieve, somehow it feels to me like I should be performing some kind of batch operation.
Many Thanks
Update:
I have found the following code to be far far more efficient, but I would still like suggestions on how to improve further.
List<string> ids = param.Select(p => p.id).ToList();
var results = this.context.Place.Find(Query.In("Id", new BsonArray(ids))).ToList();
The most efficient option is this:
var newIds = new HashSet<string>(param.Select(p => p.Id));
newIds.ExceptWith(
places.Find(Query<Place>.In(p => p.Id, newIds))
.SetFields(Fields<Place>.Include(p => p.Id))
.Select(p => p.Id));
HashSet
enables efficient comparisons using the item's GetHashCode (and Equals). The query returns all the existing items in a single query. The SetFields
returns only the ids so the built in _id
index (which is probably on RAM) is used and there's no need to even use the actual data files.