Search code examples
mongodbmongoidmongodb-ruby

Mongoid: retrieving documents whose _id exists in another collection


I am trying to fetch the documents from a collection based on the existence of a reference to these documents in another collection.

Let's say I have two collections Users and Courses and the models look like this:

  • User: {_id, name}
  • Course: {_id, name, user_id}

Note: this just a hypothetical example and not actual use case. So let's assume that duplicates are fine in the name field of Course. Let's thin Course as CourseRegistrations.

Here, I am maintaining a reference to User in the Course with the user_id holding the _Id of User. And note that its stored as a string.

Now I want to retrieve all users who are registered to a particular set of courses.

I know that it can be done with two queries. That is first run a query and get the users_id field from the Course collection for the set of courses. Then query the User collection by using $in and the user ids retrieved in the previous query. But this may not be good if the number of documents are in tens of thousands or more.

Is there a better way to do this in just one query?


Solution

  • What you are saying is a typical sql join. But thats not possible in mongodb. As you suggested already you can do that in 2 different queries.

    There is one more way to handle it. Its not exactly a solution, but the valid workaround in NonSql databases. That is to store most frequently accessed fields inside the same collection.

    You can store the some of the user collection fields, inside the course collection as embedded field.

    Course : {
       _id : 'xx',
       name: 'yy'
       user:{
           fname : 'r',
           lname :'v',
           pic: 's'
       }  
    }
    

    This is a good approach if the subset of fields you intend to retrieve from user collection is less. You might be wondering the redundant user data stored in course collection, but that's exactly what makes mongodb powerful. Its a one time insert but your queries will be lot faster.