Search code examples
mongodbmongodb-queryaws-documentdb

DocumentDB how to add an array of all documents sharing same value


I have a collection with documents following this structure:

{
  _id: 11,
  tx_id: "a",
  amount: 12,
  order_id: 1
},
{
  _id: 22,
  tx_id: "b",
  amount: 11,
  order_id: 1
},
{
  _id: 33,
  tx_id: "c",
  amount: 4,
  order_id: 2
}

And I want to add a field relatedTxs with all the transactions sharing the same order_id as the parent, something like:

{
  _id: 11,
  tx_id: "a",
  amount: 12,
  order_id: 1
  relatedTxs: [
    {
      _id: 11,
      tx_id: "a",
      amount: 12,
      order_id: 1
    },
    {
      _id: 22,
      tx_id: "b",
      amount: 11,
      order_id: 1
    }
  ]
},
{
  _id: 22,
  tx_id: "b",
  amount: 11,
  order_id: 1
  relatedTxs: [
    {
      _id: 11,
      tx_id: "a",
      amount: 12,
      order_id: 1
    },
    {
      _id: 22,
      tx_id: "b",
      amount: 11,
      order_id: 1
    }
  ]
},
{
  _id: 33,
  tx_id: "c",
  amount: 4,
  order_id: 2
  relatedTxs: [
    {
      _id: 33,
      tx_id: "c",
      amount: 4,
      order_id: 2
    }
  ]
}

The order of the related transactions array doesn't matter and including the parent is optional. Even the format is not required to be that one, I just need the whole document info and an array inside with all the info about the transactions sharing the same order_id.

I managed to accomplish that with a $lookup to the same collection, but the performance is really poor.

The order_id field has an index, and the database is DocumentDB.

Is there any way to do the same thing without a lookup? Maybe using $group? Thank you very much!


Solution

  • Query1

    • you can do a self-lookup with pipeline
    • pipeline filter to keep only those with same order_id and then groups to put them inside an array
    • the last $set is to fix the structure to get the first element from the lookup array and the field, to make them not-nested like in your expected output
    • to make it faster create an index on order_id

    Playmongo

    coll.aggregate(
    [{"$lookup": 
       {"from": "coll",
        "pipeline": 
         [{"$match": {"$expr": {"$eq": ["$order_id", "$$order_id"]}}},
           {"$group": {"_id": "$order_id", "relatedTxs": {"$push": "$$ROOT"}}}],
        "as": "relatedTxs",
        "let": {"order_id": "$order_id"}}},
     {"$set": 
       {"relatedTxs": 
         {"$getField": 
           {"field": "relatedTxs", "input": {"$first": "$relatedTxs"}}}}}])
    

    Query2 (without pipeline $lookup)

    • group by order_id and collect both the documents
    • duplicate the array in a temp-array
    • unwind
    • replace the root with each unwinded document

    Playmongo

    coll.aggregate(
    [{"$group": {"_id": "$order_id", "orders-temp": {"$push": "$$ROOT"}}},
     {"$set": {"orders": "$orders-temp"}}, {"$unwind": "$orders-temp"},
     {"$replaceRoot": 
       {"newRoot": {"$mergeObjects": ["$orders-temp", "$$ROOT"]}}},
     {"$unset": ["orders-temp", "_id"]}])