Search code examples
javascriptmongodbmongoose

MongoDB query: how to find if a string within a nested objects


I have the following Course Schema with an attendance field, which stores a dictionary with the following key-value structure (string: List). In particular, the attendance object's key stores a date of a class while the value stores a list of students who attended (student IDs).

var CourseSchema = new Schema({
  attendance: {},
});

For instance, a single Course item could be as follows:

{"_id":"559a7353186a384b54f9aea9","attendance":{"2015-12-17":["558febdb949eff4711e621e9","559020fe79f141941ddd3246"],"2015-11-14":["558febdb949eff4711e621e9","559020fe79f141941ddd3246"]}

I'm trying to return courses that has a student has attended (so basically return any course that in its attendance field contains a particular student ID. I'm trying to do it as follows, but it's returning empty results. Is there something wrong with my query?

I tried checking this link, but it doesn't work for me too. MongoDB query inside a nested array

Course.find({"attendance.$": {$in: ["558febdb949eff4711e621e9"]}}...)

Solution

  • The solution is sound, but your schema is at fault here. There are big problems with using "named keys" with a database like this, and you should consider them as "attributes" instead.

    Such as in:

    {
        "_id" : ObjectId("559a7353186a384b54f9aea9"),
        "attendance" : [
                {
                        "date" : ISODate("2015-12-17T00:00:00Z"),
                        "student" : ObjectId("558febdb949eff4711e621e9")
                },
                {
                        "date" : ISODate("2015-12-17T00:00:00Z"),
                        "student" : ObjectId("559020fe79f141941ddd3246")
                },
                {
                        "date" : ISODate("2015-11-14T00:00:00Z"),
                        "student" : ObjectId("558febdb949eff4711e621e9")
                },
                {
                        "date" : ISODate("2015-11-14T00:00:00Z"),
                        "student" : ObjectId("559020fe79f141941ddd3246")
                }
        ]
    }
    

    Which would be defined in a schema as:

    AttendanceSchmema = new Schema({
      "date": Date,
      "student": { "type": Schema.Types.ObjectId, "ref": "Student" }
    },{ "_id": false });
    
    CourseSchema = new Schema({
      "attendance": [AttendanceSchema]
    });
    

    Then I can simply match for as given student:

    Course.find(
      { "attendance.student": "559020fe79f141941ddd3246" },
      { "attendance.$": 1 }
    )
    

    Or even a student on a given date:

    Course.find(
      { "attendance": { 
         "$elemMatch": { 
           "student": "559020fe79f141941ddd3246",
           "date": { "$gte": new Date("2015-12-15"), "$lt": new Date("2015-12-16") }
         }
      }},
      { "attendance.$": 1 }
    )
    

    And even more possibilities in between. The reason we put all of this in a single array with "attribute" like properties is to make it simple to query and store.


    Without such a change, you are stuck with horribly performing queries like:

    Course.find({ "$where": function() {
      return Object.keys(this.attendance).some(function(k) {
        return this.attendance[k].indexOf("558febdb949eff4711e621e9") > -1
      });
    }})
    

    Presuming of course that those values are presently "strings", as they would appear to be by the Mixed type.

    And even worse queries for anything more complex, which also cannot tell you which position was matched. But by far the worst thing is these cannot possibly use an "index" in order to match.

    Therefore you really should consider changing the database schema here to a form that is going to support querying in the most efficient way.