Search code examples
node.jsmongodbmongodb-queryaggregation-frameworkpymongo

Mongo db - how to join and sort two collection with pagination


I have 2 collections:

  1. Office -
{
  _id: ObjectId(someOfficeId),
  name: "some name",
  ..other fields
}
  1. Documents -
{
  _id: ObjectId(SomeId),
  name: "Some document name",
  officeId: ObjectId(someOfficeId),
  ...etc
}

I need to get list of offices sorted by count of documetns that refer to office. Also should be realized pagination.

I tryied to do this by aggregation and using $lookup

const aggregation = [
        {
            $lookup: {
                from: 'documents',
                let: {
                    id: '$id'
                },
                pipeline: [
                    {
                        $match: {
                            $expr: {
                                $eq: ['$officeId', '$id']
                            },
                            // sent_at: {
                            //     $gte: start,
                            //     $lt: end,
                            // },
                        }
                    }
                ],
                as: 'documents'
            },
        },
        { $sortByCount:  "$documents" },
        { $skip: (page - 1) * limit },
        { $limit: limit },
    ];

But this doesn't work for me

Any Ideas how to realize this?

p.s. I need to show offices with 0 documents, so get offices by documets - doesn't work for me


Solution

  • Query

    • you can use lookup to join on that field, and pipeline to group so you count the documents of each office (instead of putting the documents into an array, because you only case for the count)
    • $set is to get that count at top level field
    • sort using the noffices field
    • you can use the skip/limit way for pagination, but if your collection is very big it will be slow see this. Alternative you can do the pagination using the _id natural order, or retrieve more document in each query and have them in memory (instead of retriving just 1 page's documents)

    Test code here

    offices.aggregate(
    [{"$lookup":
      {"from":"documents",
       "localField":"_id",
       "foreignField":"officeId",
       "pipeline":[{"$group":{"_id":null, "count":{"$sum":1}}}],
       "as":"noffices"}},
     {"$set":
      {"noffices":
       {"$cond":
        [{"$eq":["$noffices", []]}, 0,
         {"$arrayElemAt":["$noffices.count", 0]}]}}},
     {"$sort":{"noffices":-1}}])
    

    As the other answer pointed out you forgot the _ of id, but you don't need the let or match inside the pipeline with $expr, with the above lookup. Also $sortByCount doesn't count the member of an array, you would need $size (sort by count is just group and count its not for arrays). But you dont need $size also you can count them in the pipeline, like above.


    Edit

    Query

    • you can add in the pipeline what you need or just remove it
    • this keeps all documents, and counts the array size
    • and then sorts

    Test code here

    offices.aggregate(
    [{"$lookup":
      {"from":"documents",
       "localField":"_id",
       "foreignField":"officeId",
       "pipeline":[],
       "as":"alldocuments"}},
     {"$set":{"ndocuments":{"$size":"$alldocuments"}}},
     {"$sort":{"ndocuments":-1}}])