Search code examples
mongodbpipelineprojection

How can I reduce the duplicate data from my aggregation pipeline?


I have a pipeline that works great for what I need... but I think there is some redundant data that can be removed from the pipeline.

Expected output

This is what I want the output to look like

{
  "_id": "5ecee2189fdd1b0004056936",
  "name": "Mike",
  "history": [
    {
        "_id": "5ecb263c166b8500047c1411",
        "what": "Log IN"
    },
    {
        "_id": "5ecb263c166b8500047c1422",
        "what": "Log OUT"
    }
  ]
}

Current output

This is what the output currently looks like

{
  "docs": [
    {
      "_id": "5ecee2189fdd1b0004056936",
      "name": "Mike",
      "history": {
        "_id": "5ecb263c166b8500047c1411",
        "what": "Log IN"
      },
      "historyIndex": 0
    },
    {
      "_id": "5ecee2189fdd1b0004056936",
      "name": "Mike",
      "history": {
        "_id": "5ecb263c166b8500047c1422",
        "what": "Log OUT"
      },
      "historyIndex": 1
    }
  ]
}

User doc

In real life there will be more users than this... of course...

{
  "_id": "5ecee2189fdd1b0004056936",
  "name": "Mike",
}

History docs

again, to make it simple, I am keeping data short

[
  {
    "_id": "5ecb263c166b8500047c1411",
    "userId": "5ecee2189fdd1b0004056936",
    "what": "Log IN"
  },
  {
    "_id": "5ecb263c166b8500047c1422",
    "userId": "5ecee2189fdd1b0004056999",
    "what": "Log IN"
  },
  {
    "_id": "5ecb263c166b8500047c1433",
    "userId": "5ecee2189fdd1b0004056936",
    "what": "Log OUT"
  },
  {
    "_id": "5ecb263c166b8500047c1444",
    "userId": "5ecee2189fdd1b0004056999",
    "what": "Log OUT"
  }
]

mongoose-aggregate-paginate-v2 middleware

I am also using mongoose-aggregate-paginate-v2, but I don't think that is my issue, but it definitely comes into play when the results are returned. it needs to have the docs flattened so it can count and paginate them:

    "totalDocs": 941,
    "limit": 500,
    "page": 1,
    "totalPages": 2,
    "pagingCounter": 1,
    "hasPrevPage": false,
    "hasNextPage": true,
    "prevPage": null,
    "nextPage": 2

Pipeline

Here is my pipeline

        var agg_match = {
            $match: 
            {
                _id: mongoose.Types.ObjectId(userId)
            }
        };

        var agg_lookup = {
            $lookup: {
                from: 'it_userhistories',
                localField: '_id',
                foreignField: 'userId',
                as: 'history'
            }
        }

        var agg_unwind = {
            $unwind: {
                path: "$history",
                preserveNullAndEmptyArrays: true,
                includeArrayIndex: 'historyIndex',
            }
        }

        var agg = [
            agg_match,
            agg_lookup,
            agg_unwind,
            agg_project,
        ];


        var pageAndLimit = {
            page:page,
            limit:limit
        }


       User.aggregatePaginate(myAggregate, pageAndLimit)

Solution

  • You can use $map operator to do this. Following query will be helpful (I have not included the match stage in the pipeline, you can easily include it):

    db.user.aggregate([
      {
        $lookup: {
          from: "history",
          localField: "_id",
          foreignField: "userId",
          as: "history"
        }
      },
      {
        $project: {
          name: 1,
          history: {
            $map: {
              input: "$history",
              as: "h",
              in: {
                _id: "$$h._id",
                what: "$$h.what"
              }
            }
          }
        }
      }
    ])
    

    MongoPLayGroundLink