Search code examples
mongodbmongodb-querymongodb-.net-driver

Why is my .NET MongoDb Driver Query Horribly Slow?


I run a query to select all items where field1=x, field2 =y, and field3=z directly in Mongo (Robomongo, CLI, whatever) and it takes less than a second on several hundred thousand items:

db.items.find( { 
    $and: [ 
        { CreatingOrgId: 1 }, 
        { LocationId: 941 },
        { StatusId: 1}
    ] 
} )

I then try to run the exact same thing from C# driver and it lags out(only "Aggregate filter" code is relevant, everything else is for context):

FilterDefinition<BsonDocument> locationsFilter;  = Builders<BsonDocument>.Filter.Eq("LocationId", 941);
FilterDefinition<BsonDocument> orgFilter = Builders<BsonDocument>.Filter.Eq("CreatingOrgId", 1);
FilterDefinition<BsonDocument> statusFilter = Builders<BsonDocument>.Filter.Eq("StatusId", 1);

FilterDefinition<BsonDocument> aggregateFilter = locationsFilter & statusFilter & orgFilter;

        List<ItemViewModel> stuffList = mongoItemsCollection
                                             .Find(aggregateFilter)
                                             .Project(x => Mapper.Map<BsonDocument, StuffViewModel>(x))
                                             .ToListAsync().Result;

What's my mistake here? Here are the queries as mongo sees them:

enter image description here

Edit: Looks like mapping the items to item object is what's killing my query to some degree. It's fairly fast (couple of seconds for a huge number of records) without the mapping, like so:

    var rawItems = mongoItemsCollection
                             .Find(aggregateFilter)
                             .ToListAsync().Result;

Edit 2: Looks like automapper is a HUGE part of the problem here (taking the bson "item" object and converting it to a .NET view model). I am still interested in optimization of the .NET --> mongo query itself though (ignoring the automapper part), if anyone wants to answer.


Solution

  • When you run .ToListAsync() in C# the entire results of the query are being accessed and returned.

    Whereas when you run .find() on the command line only 20 (by default) are returned.

    A more equivalent test on the command line would be .find().toArray() which will also access and return all results. Or you could put a limit in your C# query.

    If your full result set is a lot larger than the shell batch size, this could account for some of the difference in your results. This difference will be worse if the query is not covered (i.e. the all the fields in the query and all the fields returned are not in the same index) and if the data accessed is not in memory but is accessed from disk.