Search code examples
c#mongodblinqaggregation-frameworkmongodb-.net-driver

MongoDb.Linq equivalent query for conditional aggregator


I'm grouping my data with MongoDb.Linq in the following way:

group r1 by new { r1.SupId, r1.SupFullName } into g2
select new
{
    g2.Key.SupFullName,
    Volunteers = from v0 in g2
                 where v0.VolId != g2.Key.SupId
                 select new { v0.VolFullName, v0.VolPeopleCount }
}

That is, I want only in Volunteers those which Ids are not the same as the group's key. I was expecting the driver to generate a conditional $push, but instead it gives a plain one:

"$push" : { 
    "VolFullName" : "$VolFullName", 
    "VolPeopleCount" : "$VolPeopleCount" 
}

Now, by reading this post I was able to transform the above query into the following one:

"$push": {
    "$cond": [
        { "$ne": [ "$VolId", "$SupId" ] },
        { 
            "VolFullName": "$VolFullName",
            "VolPeopleCount": "$VolPeopleCount"
        },
        null
     ]
 }

Which gives me the desired results. I'd like to know how, if it's possible, to instruct the csharp driver to generate a query like the above. Thanks in advance!

EDIT: Following @mickl suggestion I was able to generate a conditional $push (and also a conditional $sum) but the result was not as expected. Suppose I have the following docs in my collection:

{ N: "P_1", S: { I: "S_1", FN: "S_1" }, C: { I: "S_1", FN: "S_1" } }
{ N: "P_2", S: { I: "S_1", FN: "S_1" }, C: { I: "S_1", FN: "S_1" } }
{ N: "P_3", S: { I: "S_1", FN: "S_1" }, C: { I: "S_1", FN: "S_1" } }
{ N: "P_4", S: { I: "S_1", FN: "S_1" }, C: { I: "C_2", FN: "C_2" } }
{ N: "P_5", S: { I: "S_1", FN: "S_1" }, C: { I: "C_2", FN: "C_2" } }
{ N: "P_6", S: { I: "S_1", FN: "S_1" }, C: { I: "C_2", FN: "C_2" } }
{ N: "P_7", S: { I: "S_1", FN: "S_1" }, C: { I: "C_3", FN: "C_3" } }
{ N: "P_8", S: { I: "S_1", FN: "S_1" }, C: { I: "C_3", FN: "C_3" } }
{ N: "P_9", S: { I: "S_1", FN: "S_1" }, C: { I: "C_3", FN: "C_3" } }

Here N, I, FN, S and C means name, id, full name, supervisor and creator respectively. So as you can see supervisor S_1 is supervising every doc, has created 3 docs and there are 6 docs created by 2 other creators that are supervised as well by S_1. What I want to do is to summarize those docs by supervisors and then by creators with the amount of docs (people) created. So, my LINQ query looks like:

var q = from p in col.AsQueryable()
        group p by new
        {
           SupFullName = p.S.FN,
           SupId = p.S.I,
           CtorFullName = p.C.FN,
           CtorId = p.C.I
        } into g1
        select new
        {
            g1.Key.SupFullName,
            g1.Key.SupId,
            VolFullName = g1.Key.CtorFullName,
            VolId = g1.Key.CtorId,
            VolPeopleCount = g1.LongCount()
        } into r1
        group r1 by new { r1.SupId, r1.SupFullName } into g2
        select new
        {
           g2.Key.SupFullName,
           g2.Key.SupId,
           Volunteers = g2.Select(v => v.VolId != g2.Key.SupId
                           ? new { v.VolId, v.VolFullName, v.VolPeopleCount }
                           : null),
          SupPeopleCount = g2.Select(v => v.VolId == g2.Key.SupId 
                               ? v.VolPeopleCount 
                               : 0).Sum()
        } into r2
        orderby r2.SupPeopleCount descending
        select r2;

The output that query give me is:

Supervisor: S_1, PeopleCount: 0
   Creator: C_3, PeopleCount: 3
   Creator: C_2, PeopleCount: 3
   Creator: S_1, PeopleCount: 3

Which isn't what I want. Correct output would be

Supervisor: S_1, PeopleCount: 3
   Creator: C_3, PeopleCount: 3
   Creator: C_2, PeopleCount: 3

It generates the conditional aggregators in the following way (If I remove the prefix $_id in the condition and leave just $SupId it works like a charm):

"__agg0": {
      "$push": {
         "$cond": [
             { "$ne": [ "$VolId", "$_id.SupId" ] },
             {
                "VolId": "$VolId",
                "VolFullName": "$VolFullName",
                "VolPeopleCount": "$VolPeopleCount"
             },
             null
         ]
      }
},
"__agg1": {
    "$sum": {
       "$cond": [
           { "$eq": [ "$VolId", "$_id.SupId" ] },
           "$VolPeopleCount",
           NumberLong(0)
       ]
   }
}

Thanks!


Solution

  • You can run Select and use ternary operator in your LINQ statement:

    group r1 by new { r1.SupId, r1.SupFullName } into g2
    select new
    {
        g2.Key.SupFullName,
        Volunteers = g2.Select(x => x.VolId != x.SupId ? new { x.VolFullName, x.VolPeopleCount }: null)  
    };
    

    This gets translated into following aggregation pipeline:

    { 
        "$group" : { 
            "_id" : { "SupId" : "$SupId", "SupFullName" : "$SupFullName" }, 
            "__agg0" : { 
                "$push" : { 
                    "$cond" : [
                            { "$ne" : ["$VolId", "$SupId"] }, 
                            { "VolFullName" : "$VolFullName", "VolPeopleCount" : "$VolPeopleCount" }, 
                            null
                        ] 
                    } 
                } 
            } 
    }, 
    { "$project" : { "SupFullName" : "$_id.SupFullName", "Volunteers" : "$__agg0", "_id" : 0 } }