Search code examples
mongodbaggregation

MongoDB aggregation pipeline project several values from the same id


I am struggling with mongoDB pipelines. I am working on a MERN stack that processes data.

We ask people questions with a form, a form is described as a session, e.g. one session per people. Each session is recorded in a table like that :

{ _id: 1, created_at:"01/01/2021"}
{ _id: 2, created_at:"02/01/2021"}
{ _id: 3, created_at:"03/01/2021"}

All of their answers are stored in a table like that the foreign key being sessionId:

{ _id: 1, value:"Name1", sessionId : 1, typeofField :"name"}
{ _id: 2, value:"Firstname1", sessionId : 1, typeofField :"firstname"}
{ _id: 3, value:"Date of birth1", sessionId : 1, typeofField :"birthdate"}
{ _id: 4, value:"Name2", sessionId : 2, typeofField :"name"}
{ _id: 5, value:"Firstname2", sessionId : 2, typeofField :"firstname"}
{ _id: 6, value:"Date of birth2", sessionId : 2, typeofField :"birthdate"}

How can I project this data to have every informations of a session in order like this :

{id :1, created_at:"01/01/2021", name : "Name1", firstname: "Firstname1", birthdate : "Date of Birth1"}
{id :2, created_at:"02/01/2021", name : "Name2", firstname: "Firstname2", birthdate : "Date of Birth2"}

Solution

  • An alternate solution, going in the other direction (from answers to session):

    c = db.answers.aggregate([
        // Bring all answers together as a k-v array:                                                                     
        {$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
    
        // Do a 1:1 lookup:                                                                                               
        ,{$lookup: {from: "session", localField: "_id", foreignField: "_id", as: "Z"}}
    
        // We now have flds as a k-v array.  We know that Z[0] cotains both                                               
        // created_at and _id.  We seek to create a full k-v array that we can                                            
        // turn into the target object, so working the expression below "backwards"                                       
        // 1. Pull element 0 from the Z array                                                                             
        // 2. Turn that into a k-v array, e.g. [{k:_id,v:1},{k:created_at,v:02/01/2021}]                                  
        //    with $objectToArray.  Important: we pick up _id here.                                                       
        // 3. Concat the flds k-v array with the Xsession lookup k-v array                                                
        // 4. We now have a complete k-v representation of our data.  Use $arrayToObject                                  
        //    to turn (e.g.) {k:created_at,v:02/01/2021} into created_at:02/01/2021                                       
        // 5. Don't assign the object to a fld (like X).  Instead make that object the                                    
        //    new root.  newRoot is the only arg to $replaceRoot:                                                         
        ,{$replaceRoot: { newRoot:
            {$arrayToObject:
               {$concatArrays: [ "$flds", {$objectToArray: {$arrayElemAt: ["$Z",0]}} ] }}}}
    
    ]);
    

    Or if you want more control over the fields instead of picking up everything in the sesssion doc:

    c = db.answers.aggregate([
        {$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
        ,{$lookup: {from: "session", localField: "_id", foreignField: "_id", as: "Z"}}
    
        // Don't want all the fields from the lookup?  No problem: wrap the                                               
        // $objectToArray with a filter and only let k = [_id,created_at,foo]                                             
        // or whatever else you want.  Make sure to always include _id.                                                   
        // Of course, if you want to exclude fields and keep the rest, just use                                           
        // the $not operator.  Be sure not to exclude _id; see commented cond below:                                      
        ,{$replaceRoot: {newRoot: {$arrayToObject: {$concatArrays: [ "$flds",
                         {$filter: {input: {$objectToArray: {$arrayElemAt: ["$Z",0]}},
                                       as: "z",
                                    cond: {$in: ["$$z.k", ["_id","created_at","foo"]]}
                                    //cond: {$not: {$in: ["$$z.k", ["foo"]]}}                                             
                                   }}
                                                                   ]
                                                   }}
                        }}
    ]);
    

    Depending on the amount of material looked up in session, you may want to use the fancier version of $lookup to filter the fields there:

    c = db.answers.aggregate([
        {$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
        ,{$lookup: {from: "session",
                    let: { sid: "$_id" },
                    pipeline: [
                        {$match: {$expr: {$eq: [ "$_id", "$$sid" ]} }},
                        {$project: {"_id":true, "created_at":true,"foo":true}}
                    ],
                    as: "Z"
                   }}
        ,{$replaceRoot: { newRoot:
            {$arrayToObject:
               {$concatArrays: [ "$flds", {$objectToArray: {$arrayElemAt: ["$Z",0]}} ] }}}}
    ]);