
MongoDB: Aggregate Two Collections


I have two collections. The main one holds the larger documents, and the second one holds smaller documents. Both collections have at least one semantically equivalent field: title_en in the first and Name in the second. Is it possible to combine these two collections into one using MongoDB aggregation? In pseudo-query form, I imagine something like:

APPEND to Collection1_DOC (field_name: field_value) from Collection2_DOC
WHERE Collection1_DOC.title_en = Collection2_DOC.Name

Does MongoDB aggregation provide this kind of functionality?

{
   "age_rating":"R",
   "age_rating_guide":"17+ (violence & profanity)",
   "average_rating":"82.47",
   "episode_count":26,
   "episode_length":25,
   "poster_image":"https://media.kitsu.io/anime/poster_images/1/original.jpg?1597604210",
   "show_type":"TV",
   "title_en":"Cowboy Bebop",
   "title_ja_jp":"カウボーイビバップ",
   "total_length":650
}
{
   "End_year":1999,
   "Name":"Cowboy Bebop",
   "Release_season":"Spring",
   "Release_year":1998,
   "Tags":"Action, Adventure, Drama, Sci Fi, Bounty Hunters, Episodic, Noir, Outer Space, Western, Original Work, Drug Use,, Mature Themes,, Nudity,, Violence"
}
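
The intended result of the pseudo-query can be sketched in plain JavaScript, outside MongoDB, using trimmed copies of the two sample documents above. This is only an illustration of the desired merge, not a query; appendMatching and the variable names are made up for this example.

```javascript
// Trimmed copies of the two sample documents from the question.
const animeDoc = { title_en: "Cowboy Bebop", show_type: "TV", episode_count: 26 };
const detailsDoc = {
  Name: "Cowboy Bebop",
  Release_season: "Spring",
  Release_year: 1998,
  End_year: 1999,
};

// appendMatching is a hypothetical helper: when the two name fields are equal,
// copy every field except the join key from `extra` into a copy of `base`.
function appendMatching(base, extra) {
  if (base.title_en !== extra.Name) return base; // no match: leave the document as is
  const { Name, ...fields } = extra;
  return { ...base, ...fields };
}

const merged = appendMatching(animeDoc, detailsDoc);
console.log(merged);
```

In MongoDB itself this matching step corresponds to what a $lookup stage does inside an aggregation pipeline.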

Solution

  • I found out more about aggregation and want to share a pipeline that mostly works for me.

    db.main.aggregate([
      {
        "$lookup": {
          "from": "info",
          "localField": "title_en",
          "foreignField": "Name",
          "as": "linked_collections"
        }
      },
      {
        "$unwind": "$linked_collections"
      },
      {
        "$project": {
          "age_rating": 1,
          "age_rating_guide": 1,
          "average_rating": 1,
          "description": 1,
          "episode_count": 1,
          "episode_length": 1,
          "poster_image": 1,
          "show_type": 1,
          "title_en": 1,
          "title_ja_jp": 1,
          "total_length": 1,
          "release_season": "$linked_collections.Release_season",
          "Release_year": "$linked_collections.Release_year",
          "Tags": "$linked_collections.Tags"
        }
      },
      {
        "$out": "main_new"
      }
    ])
    

    Unfortunately, the $out stage fails with an error, but as far as I can see this is still the right approach:

    caused by :: E11000 duplicate key error 
    collection: anime_app.tmp.agg_out.c0db80b8-2a40-40cb-84ec-5b8c3c281ca2 
    index: _id_ 
    dup key: { _id: ObjectId('62da48e463f67a0586394026') }
    

    Update: this error goes away after adding one line, "_id": 0, to the $project stage. It also removed all the duplicates. The fix comes from an answer by a MongoDB engineer:

    "$project": {
      "_id": 0,
      "age_rating": 1,
      // ...the remaining fields as before
    }
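
One way to read the E11000 error and the fix, sketched as a plain-JavaScript simulation rather than a real MongoDB call: $unwind emits one copy of the main document per matched element, and every copy keeps the original _id, so $out sees the same key more than once. Excluding _id in $project lets the server assign a fresh one to each output document. The sample data, the duplicated info entry, and all variable names are assumptions made for illustration.

```javascript
// Plain-JavaScript simulation of $lookup + $unwind + $project on one document.
// This is not MongoDB; all data below is made up for illustration.
const mainDoc = { _id: "a1", title_en: "Cowboy Bebop", show_type: "TV" };
const infoDocs = [
  { _id: "b1", Name: "Cowboy Bebop", Release_year: 1998 },
  { _id: "b2", Name: "Cowboy Bebop", Release_year: 1998 }, // hypothetical duplicate entry
];

// $lookup: collect every info document whose Name equals title_en.
const linked = infoDocs.filter((d) => d.Name === mainDoc.title_en);

// $unwind: emit one copy of the main document per matched element.
const unwound = linked.map((match) => ({ ...mainDoc, linked_collections: match }));

// Every copy carries the same _id, which is what $out rejects with E11000.
const ids = unwound.map((d) => d._id);
console.log(ids);

// $project with "_id": 0: the copies no longer carry an _id,
// so a new one can be generated for each document on insert.
const projected = unwound.map(({ _id, linked_collections, ...rest }) => ({
  ...rest,
  Release_year: linked_collections.Release_year,
}));
console.log(projected);
```

Under this reading, a title that matches several info documents would still produce one output document per match; only the _id collision is avoided.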