I am struggling with mongoDB pipelines. I am working on a MERN stack that processes data.
We ask people questions with a form, a form is described as a session, e.g. one session per people. Each session is recorded in a table like that :
{ _id: 1, created_at:"01/01/2021"}
{ _id: 2, created_at:"02/01/2021"}
{ _id: 3, created_at:"03/01/2021"}
All of their answers are stored in a table like that the foreign key being sessionId:
{ _id: 1, value:"Name1", sessionId : 1, typeofField :"name"}
{ _id: 2, value:"Firstname1", sessionId : 1, typeofField :"firstname"}
{ _id: 3, value:"Date of birth1", sessionId : 1, typeofField :"birthdate"}
{ _id: 4, value:"Name2", sessionId : 2, typeofField :"name"}
{ _id: 5, value:"Firstname2", sessionId : 2, typeofField :"firstname"}
{ _id: 6, value:"Date of birth2", sessionId : 2, typeofField :"birthdate"}
How can I project this data to have every informations of a session in order like this :
{id :1, created_at:"01/01/2021", name : "Name1", firstname: "Firstname1", birthdate : "Date of Birth1"}
{id :2, created_at:"02/01/2021", name : "Name2", firstname: "Firstname2", birthdate : "Date of Birth2"}
An alternate solution, going in the other direction (from answers to session):
c = db.answers.aggregate([
// Bring all answers together as a k-v array:
{$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
// Do a 1:1 lookup:
,{$lookup: {from: "session", localField: "_id", foreignField: "_id", as: "Z"}}
// We now have flds as a k-v array. We know that Z[0] cotains both
// created_at and _id. We seek to create a full k-v array that we can
// turn into the target object, so working the expression below "backwards"
// 1. Pull element 0 from the Z array
// 2. Turn that into a k-v array, e.g. [{k:_id,v:1},{k:created_at,v:02/01/2021}]
// with $objectToArray. Important: we pick up _id here.
// 3. Concat the flds k-v array with the Xsession lookup k-v array
// 4. We now have a complete k-v representation of our data. Use $arrayToObject
// to turn (e.g.) {k:created_at,v:02/01/2021} into created_at:02/01/2021
// 5. Don't assign the object to a fld (like X). Instead make that object the
// new root. newRoot is the only arg to $replaceRoot:
,{$replaceRoot: { newRoot:
{$arrayToObject:
{$concatArrays: [ "$flds", {$objectToArray: {$arrayElemAt: ["$Z",0]}} ] }}}}
]);
Or if you want more control over the fields instead of picking up everything in the sesssion
doc:
c = db.answers.aggregate([
{$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
,{$lookup: {from: "session", localField: "_id", foreignField: "_id", as: "Z"}}
// Don't want all the fields from the lookup? No problem: wrap the
// $objectToArray with a filter and only let k = [_id,created_at,foo]
// or whatever else you want. Make sure to always include _id.
// Of course, if you want to exclude fields and keep the rest, just use
// the $not operator. Be sure not to exclude _id; see commented cond below:
,{$replaceRoot: {newRoot: {$arrayToObject: {$concatArrays: [ "$flds",
{$filter: {input: {$objectToArray: {$arrayElemAt: ["$Z",0]}},
as: "z",
cond: {$in: ["$$z.k", ["_id","created_at","foo"]]}
//cond: {$not: {$in: ["$$z.k", ["foo"]]}}
}}
]
}}
}}
]);
Depending on the amount of material looked up in session
, you may want to use the fancier version of $lookup
to filter the fields there:
c = db.answers.aggregate([
{$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
,{$lookup: {from: "session",
let: { sid: "$_id" },
pipeline: [
{$match: {$expr: {$eq: [ "$_id", "$$sid" ]} }},
{$project: {"_id":true, "created_at":true,"foo":true}}
],
as: "Z"
}}
,{$replaceRoot: { newRoot:
{$arrayToObject:
{$concatArrays: [ "$flds", {$objectToArray: {$arrayElemAt: ["$Z",0]}} ] }}}}
]);