Search code examples
djangomongodbmongodb-queryaggregation-frameworkpymongo

MongoDB Aggregate - Match condition only on second collection but provide all documents from the first


Basically, I have 2 collections in my MongoDB database -> Books, Scores.

Books

{
    "BOOK_ID" : "100",
    "BOOK_NAME" : "Book 1",
    "BOOK_DESC" : "abcd",
},
{
    "BOOK_ID" : "101",
    "BOOK_NAME" : "Book 2",
    "BOOK_DESC" : "efgh",
},
{
    "BOOK_ID" : "102",
    "BOOK_NAME" : "Book 3",
    "BOOK_DESC" : "ijkl",
}

Scores

{
    "BOOK_ID" : "100",
    "BOOK_CATEGORY" : "kids",
    "BOOK_SCORE" : "6",
},
{
    "BOOK_ID" : "100",
    "BOOK_CATEGORY" : "Educational",
    "BOOK_SCORE" : "8",
},
{
    "BOOK_ID" : "101",
    "BOOK_CATEGORY" : "Kids",
    "BOOK_SCORE" : "6",
},
{
    "BOOK_ID" : "101",
    "BOOK_CATEGORY" : "Fantasy",
    "BOOK_SCORE" : "7",
}

Expected output : Searching for all books with BOOKS_CATEGORY="Kids" and `BOOKS_SCORE=6``

{
    "BOOK_ID" : "100",
    "BOOK_NAME" : "Book 1",
    "BOOK_DESC" : "abcd",
    "BOOK_CATEGORY" : "Kids",
    "BOOK_SCORE" : 6
},
{
    "BOOK_ID" : "101",
    "BOOK_NAME" : "Book 2",
    "BOOK_DESC" : "efgh",
    "BOOK_CATEGORY" : "Kids",
    "BOOK_SCORE" : 6
},
{
    "BOOK_ID" : "102",
    "BOOK_NAME" : "Book 3",
    "BOOK_DESC" : "ijkl",
}

Notice that, for all the books to which scores are available, they are appended. If a Book does not have any score associated, it still comes in the result.

What I have tried?

I have tried using $lookup

pipeline = [
                {
                    "$lookup": {
                    "from": "Scores",
                    "pipeline":[
                        {
                            "$match" : {
                                "BOOK_CATEGORY" : "Kids",
                                "BOOK_SCORE" : "6",
                            }
                        }
                    ],
                    "localField": "BOOK_ID",
                    "foreignField": "BOOK_ID",
                    "as": "SCORES", 

                    },
                },
            ]
db.Books.aggregate(pipeline)

Also, by reading the $lookup subquery docs,(https://www.mongodb.com/docs/manual/reference/operator/aggregation/lookup/#join-conditions-and-subqueries-on-a-joined-collection) I got the feeling that what I am expecting may not be possible. Can anyone help me with executing such query? (I use PyMongo btw)


Solution

  • For the last two stages:

    1. $replaceRoot - Replace the input document(s) with the new document(s) by merging the current document with the document which is the first document from the SCORES array.

    2. $unset - Remove SCORES array.

    db.Books.aggregate([
      {
        "$lookup": {
          "from": "Scores",
          "pipeline": [
            {
              "$match": {
                "BOOK_CATEGORY": "Kids",
                "BOOK_SCORE": "6",
                
              }
            }
          ],
          "localField": "BOOK_ID",
          "foreignField": "BOOK_ID",
          "as": "SCORES"
        }
      },
      {
        "$replaceRoot": {
          "newRoot": {
            "$mergeObjects": [
              "$$ROOT",
              {
                $first: "$$ROOT.SCORES"
              }
            ]
          }
        }
      },
      {
        $unset: "SCORES"
      }
    ])
    

    Sample Mongo Playground