
Should the official MongoDB driver translate a LINQ Distinct() operator into a database operation?


I'm using the official MongoDB driver and issuing a Distinct() operation (which is a supported LINQ operation), exactly as per the example.

I would expect the MongoDB log output to show some evidence that the Distinct operation is being translated into a MongoDB distinct operation on the collection, but I see none. What I see is something like this (for a collection containing 50 documents), with no sign of a distinct operation being performed:

query Test_5ipsb2hn.Collection query: { Processing: { $exists: false } } ntoreturn:0 ntoskip:0 nscanned:50 keyUpdates:0 locks(micros) r:419 nreturned:50 reslen:1193 0ms

Can anyone shed any light on whether this is the expected behaviour?


Solution

  • From what I can see with profiling turned on, the distinct should be sent to MongoDB. Below are the two profiler records for distinct queries against a test database.

    {
            "op" : "command",
            "ns" : "test.$cmd",
            "command" : {
                    "distinct" : "testing",
                    "key" : "Value"
            },
            "ntoreturn" : 1,
            "keyUpdates" : 0,
            "numYield" : 0,
            "lockStats" : {
                    "timeLockedMicros" : {
                            "r" : NumberLong(49),
                            "w" : NumberLong(0)
                    },
                    "timeAcquiringMicros" : {
                            "r" : NumberLong(2),
                            "w" : NumberLong(1)
                    }
            },
            "responseLength" : 209,
            "millis" : 0,
            "ts" : ISODate("2013-06-12T23:53:29.872Z"),
            "client" : "127.0.0.1",
            "allUsers" : [ ],
            "user" : ""
    }
    {
            "op" : "command",
            "ns" : "test.$cmd",
            "command" : {
                    "distinct" : "testing",
                    "key" : "Value",
                    "query" : {
    
                    }
            },
            "ntoreturn" : 1,
            "keyUpdates" : 0,
            "numYield" : 0,
            "lockStats" : {
                    "timeLockedMicros" : {
                            "r" : NumberLong(113),
                            "w" : NumberLong(0)
                    },
                    "timeAcquiringMicros" : {
                            "r" : NumberLong(4),
                            "w" : NumberLong(3)
                    }
            },
            "responseLength" : 209,
            "millis" : 0,
            "ts" : ISODate("2013-06-12T23:53:51.730Z"),
            "client" : "127.0.0.1",
            "allUsers" : [ ],
            "user" : ""
    }
    

    The first query is sent from C# via the LINQ provider:

    mongoAdapter.Collection<TestClass>().AsQueryable().Select(s => s.Value).Distinct().ToList();
    

    The second is the same distinct executed from the command line (the mongo shell):

    db.testing.distinct('Value')
    

    Both profiling records show the distinct within the command section. The only difference is that the second record also contains a query field, but as this is empty I don't see it affecting the actual distinct query.

    So my short answer is that I believe the LINQ Distinct operation should execute the same command as it would in the shell.

    Update

    To pass the LINQ query through to MongoDB, you need to call AsQueryable() on the collection itself, not on the results of a Find.

    So change your query from

    collection.Find(query).AsQueryable().Select(x => x.SequencingId).Distinct();
    

    to

    collection.AsQueryable().Where({your query here}).Select(x => x.SequencingId).Distinct();
    

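    The problem is that you call Find on the collection first, so the results come back to you as an Enumerable, and the AsQueryable() that follows only wraps that in-memory sequence. The Distinct is therefore performed in memory once the records have been returned, not as part of a MongoDB query. This matches the log in the question, which shows a plain query returning all 50 documents.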
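
The in-memory behaviour can be reproduced without the driver at all. The following is a minimal sketch that uses only System.Linq. The Doc class and the FakeFind() method are hypothetical stand-ins (FakeFind() plays the role of collection.Find(query), i.e. something that is only an IEnumerable), and the five sample values are made up for illustration. It suggests that calling AsQueryable() on an Enumerable hands the query to the built-in in-memory LINQ provider, so every document is pulled before Distinct is applied.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document type; only SequencingId comes from the snippets above.
public class Doc
{
    public int SequencingId { get; set; }
}

public static class InMemoryDistinctDemo
{
    // Counts how many documents were pulled from the stand-in cursor.
    public static int Fetched;

    // Stand-in for the result of collection.Find(query): a plain IEnumerable<Doc>.
    public static IEnumerable<Doc> FakeFind()
    {
        foreach (var id in new[] { 1, 1, 2, 2, 3 })
        {
            Fetched++;
            yield return new Doc { SequencingId = id };
        }
    }

    // Same shape as collection.Find(query).AsQueryable().Select(...).Distinct().
    public static IQueryable<int> BuildQuery()
    {
        return FakeFind().AsQueryable().Select(x => x.SequencingId).Distinct();
    }

    public static void Main()
    {
        Fetched = 0;
        IQueryable<int> query = BuildQuery();

        // The provider is the one built into System.Linq, not a database provider.
        Console.WriteLine(query.Provider.GetType().FullName);

        List<int> distinctIds = query.ToList();
        Console.WriteLine("documents fetched: " + Fetched);           // 5
        Console.WriteLine("distinct values:   " + distinctIds.Count); // 3
    }
}
```

All five stand-in documents are enumerated even though only three distinct values come out, which is the same pattern as nreturned:50 in the question's log. With AsQueryable() called on the collection itself, the driver's own LINQ provider sees the whole expression and can translate it into a distinct command instead.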