Search code examples
mongodbmongoosemongodb-queryaggregation-framework

mongodb $lookup operator based on key name


Having two collections tables and table_rows like following:

tables = {
  "_id": "641ce65852a7ccd2f4a7b298",
  "name": "table name",
  "description": "table description",
  "columns": [{
    "_id": "641ce65852a7ccd2f4a7b299",
    "name": "column 1",
    "dataType": "String"
  }, {
    "_id": "641cf95543a5f258bfaf69e3",
    "name": "column 2",
    "dataType": "Number"
  }]
}

table_rows = {
  "tableId": "641ce65852a7ccd2f4a7b298",
  "641ce65852a7ccd2f4a7b299": "Example string",
  "641cf95543a5f258bfaf69e3": 101
}

That in a nutshell represent regular tabular data of 2 columns and 1 row like this:

column 1 column 2
Example string 101

Is there any way to use $lookup operator to join those two collections based on _id from columns array in tables collection and based on the key name from table_rows collection? What i'm trying is to somehow join columns definitions (name, datatype, etc) along with cell values.

As you can see actual key name in card_rows collection is _id of column itself.

Ideally this would be single collection, but those tables can grow to hundred of columns and 10K of rows, so it is modeled as two collections to avoid unbound arrays in mongo.


Solution

  • One option is to start from the table_rows:

    1. $match the relevant documents using the tableId, $unwind the "rows" and $group by columns
    2. Use $lookup with pipeline to get the column name for each column document
    3. Format
    db.table_rows.aggregate([
      {$match: {tableId: tableId}},
      {$unset: ["tableId", "_id"]},
      {$project: {rows: {$objectToArray: "$$ROOT"}}},
      {$unwind: "$rows"},
      {$group: {_id: "$rows.k", values: {$push: "$rows.v"}}},
      {$lookup: {
          from: "tables",
          let: {tableId: tableId, colId: "$_id"},
          pipeline: [
            {$match: {$expr: {$eq: ["$_id", "$$tableId"]}}},
            {$project: {
                _id: 0,
                coldata: {
                  $first: {$filter: {
                      input: "$columns",
                      cond: {$eq: ["$$colId", "$$this._id"]}
                  }}
                }
            }}
          ],
          as: "colName"
      }},
      {$project: {
          colName: {$first: "$colName.coldata.name"},
          values: 1,
          _id: 0
      }}
    ])
    

    See how it works on the playground example

    *If you want to avoid unbounded arrays on the query, you can drop the $group step...