Search code examples
mongodbmany-to-manymodelingnosql

How should the following many to many relationship be modeled in MongoDB?


Suppose I have Student and Teacher in a many to many relationship. If I just want to find out all the teachers for a given student or vice versa I generally model it by using embedded Object Ids. For example if teacher has a property studentIds which is an array of student Object Ids then that is enough information to do all the queries you need.

However suppose that a student can give a teacher a rating. How should this rating fit into the model? At the moment I do the following:

  1. Inside teacher instead of storing an array of student, I store an array of json objects {studentId: ObjectId, rating: String}
  2. When doing the query, I transform the array of json objects into an array of studentIds and extract the full information as usual
  3. So now I have an array of student objects and an array of json objects with the ratings
  4. However, since the $in operator in MongoDB does not preserve ordering, I need to do my own sorting
  5. At the last step, with everything in order I can combine student objects with ratings to get what I want

It works but seems somewhat convoluted is there a better way of doing this?


Solution

  • Here are some considerations. In the end, it depends on your requirements:

    1. Rating is optional, right?

      If so, ask yourself whether you want to combine a required feature (storing teacher/student association) with a nice-to-have one. Code that implements a nice-to-have feature now writes to your most important collection. I think you can improve separation of concerns in your code with a different db schema.

    2. Will you need more features?

      Let's say you want to provide students with a list of ratings they gave, the average rating a student has given to teachers, and you want to show a development of ratings over time. This will be very messy with embedded documents. Embedded documents are less flexible.

    3. If you need top read performance, you need to denormalize more data

      If you want to stick to the embedded documents, you might want to copy more data. Let's say there's an overview of ratings per teacher where you can see the students' names. It would be helpful to embed an object

      { studentId : ObjectId, 
        rating: string, 
        studentName: string, 
        created: dateTime }
      

    As alternatives, consider

    TeacherRating {
        StudentId: id
        TeacherId: id
        Rating: number
        Created: DateTime
    }
    

    Teacher/student association will still be stored in the teacher object, but the ratings are in a different collections. A rating can't be created if no association between teacher and student can be found.

    or

    TeacherStudentClass {
        StudentId: id
        TeacherId: id
        Class: id
        ClassName: string // (denormalized, just an example)
        Rating: number // (optional)
        Created: DateTime
    }
    

    To find all students for a given teacher, you'd have to query the linker document first, then do a $in query on the students, and vice versa. That is one more query, but it comes with a huge gain in flexibility.