Search code examples
mongodbnosqlaggregation-frameworknosql-aggregation

How can I perform nested "joins" (joining 3 or more collections) in a MongoDB aggregation pipeline?


Let's say we have 3 hypothetical collections in MongoDB: customers, orders, and orderItems.

Each customer has multiple orders, and each order has multiple order items.

Here's some sample data for these 3 collections:

customers

[
    {
        customer_id: 1,
        name: "Jim Smith",
        email: "jim.smith@example.com"
    },
    {
        customer_id: 2,
        name: "Bob Jones",
        email: "bob.jones@example.com"
    }
]

orders

[
    {
        order_id: 1,
        customer_id: 1
    },
    {
        order_id: 2,
        customer_id: 1
    }
]

orderItems

[
    {
        order_item_id: 1,
        name: "Foo",
        price: 4.99,
        order_id: 1
    },
    {
        order_item_id: 2,
        name: "Bar",
        price: 17.99,
        order_id: 1
    },
    {
        order_item_id: 3,
        name: "baz",
        price: 24.99,
        order_id: 2
    }
]

Desired Result

How can I write my aggregation pipeline so that the result returned looks something like this?

[
    {
        customer_id: 1,
        name: "Jim Smith",
        email: "jim.smith@example.com"
        orders: [
            {
                order_id: 1,
                items: [
                    {
                        name: "Foo",
                        price: 4.99
                    },
                    {
                        name: "Bar",
                        price: 17.99
                    }
                ]
            },
            {
                order_id: 2,
                items: [
                    {
                        name: "baz",
                        price: 24.99
                    }
                ]
            }
        ]
    },
    {
        customer_id: 2,
        name: "Bob Jones",
        email: "bob.jones@example.com"
        orders: []
    }
]

Solution

  • Do nested lookup using lookup with pipeline,

    • $lookup with orders collection,
      • let, define variable customer_id that is from main collection, to access this reference variable inside pipeline using $$ like $$customer_id,
      • pipeline can add pipeline stages same as we do in root level pipeline
      • $expr whenever we match internal fields it requires expression match condition, so $$customer_id is parent collection field that declared in let and $customer_id is child collection's/current collection's field
    • $lookup with orderitems collection
    db.customers.aggregate([
      {
        $lookup: {
          from: "orders",
          let: { customer_id: "$customer_id" },
          pipeline: [
            { $match: { $expr: { $eq: ["$$customer_id", "$customer_id"] } } },
            {
              $lookup: {
                from: "orderitems",
                localField: "order_id",
                foreignField: "order_id",
                as: "items"
              }
            }
          ],
          as: "orders"
        }
      }
    ])
    

    Playground


    Tip:

    Several joins considered as bad practice in NoSQL, I would suggest if you could add your order items in orders collection as array, you can save one join process for orderitems, see improved version in playground