Search code examples
mongodbmongodb-queryaggregate

How do I reverse referenced documents across two collections?


I have a person collection with documents like this:

_id: ObjectId('64103dd44c7b1355e4812e87'),
sport: 'Football',
name: 'Brian',
teams: [
    ObjectId('62f63b38e6a6cbb9a58c58c4'),
    ObjectId('64219e4eb5924e070a23f846')
]

    _id: ObjectId('642ad97d0c945ff64f2a5f4c'),
sport: 'Football',
name: 'Marc',
teams: [
    ObjectId('64219e4eb5924e070a23f846')
]

_id: ObjectId('64269d6c38364df7ca902d75'),
sport: 'Basketball',
name: 'Francis',
teams: [
    ObjectId('641453c9adc10762db6997cb'),
    ObjectId('64269ebc38364df7ca902e17')
]

Then I have this teams collection with documents like this:

_id: ObjectId('62f63b38e6a6cbb9a58c58c4'),
sport: 'Football',
name: 'Team A',
persons: [
    ObjectId('642ada46f785be5eb0042f63')
]

_id: ObjectId('64219e4eb5924e070a23f846'),
sport: 'Football',
name: 'Team B',
persons: [
]

_id: ObjectId('62f3a57bcc852536a665b179'),
sport: 'Football',
name: 'Team C',
persons: [
]

_id: ObjectId('641453c9adc10762db6997cb'),
sport: 'Basketball',
name: 'Team D',
persons: [
    ObjectId('642ada848b8c7b48102c3cb4')
]

_id: ObjectId('64269ebc38364df7ca902e17'),
sport: 'Basketball',
name: 'Team E',
persons: [
    ObjectId('642ada95f8f190e02582e10d')
]

Right now my teams are referenced in the person documents, and I'm trying to create a script that basically reverses it, so that the persons are referenced in the team documents instead.

Keep in mind that there is already some existing references in the team documents which should be kept. And also the each migration should not be done across sports: A person that has the sport 'football' should not be migrated to a team that is other than 'football' for example..

The result should look like this:

person

_id: ObjectId('64103dd44c7b1355e4812e87'),
sport: 'Football',
name: 'Brian'

    _id: ObjectId('642ad97d0c945ff64f2a5f4c'),
sport: 'Football',
name: 'Marc'

_id: ObjectId('64269d6c38364df7ca902d75'),
sport: 'Basketball',
name: 'Francis'

team

_id: ObjectId('62f63b38e6a6cbb9a58c58c4'),
sport: 'Football',
name: 'Team A',
persons: [
    ObjectId('62f641ae84f995495a0f567f')
]

_id: ObjectId('64219e4eb5924e070a23f846'),
sport: 'Football',
name: 'Team B',
persons: [
    ObjectId('642ada46f785be5eb0042f63'),
    ObjectId('64103dd44c7b1355e4812e87'),
    ObjectId('642ad97d0c945ff64f2a5f4c')
]

_id: ObjectId('62f3a57bcc852536a665b179'),
sport: 'Football',
name: 'Team C',
persons: [
]

_id: ObjectId('641453c9adc10762db6997cb'),
sport: 'Basketball',
name: 'Team D',
persons: [
    ObjectId('642ada848b8c7b48102c3cb4'),
    ObjectId('64269d6c38364df7ca902d75')
]

_id: ObjectId('64269ebc38364df7ca902e17'),
sport: 'Basketball',
name: 'Team E',
persons: [
    ObjectId('642ada95f8f190e02582e10d'),
    ObjectId('64269d6c38364df7ca902d75')
]

How do I achieve this?


Solution

  • For the teams collection you can run a query like:

    db.teams.aggregate([
      {$lookup: {
          from: "persons",
          localField: "_id",
          foreignField: "teams",
          pipeline: [{$project: {_id: 1}}],
          as: "moreP"
      }},
      {$project: {
          persons: {$concatArrays: [
              "$persons",
              {$map: {
                  input: "$moreP",
                  in: "$$this._id"
              }}
          ]},
          name: 1,
          sport: 1
      }},
      {$merge: {into: "teams"}}
    ])
    

    See how it works on the playground example

    Afterwards: You can update the persons collection to unset the teams key:

    db.persons.updateMany({}, {$unset: {teams: ""}})