mongodb sorting aggregation-framework projection

MongoDB Aggregation Sorts Differently Depending on Projection


I found an odd sorting behavior in MongoDB while completing the MongoDB course M121.

You can test the collection in this cluster with:

mongo "mongodb://cluster0-shard-00-00-jxeqq.mongodb.net:27017,cluster0-shard-00-01-jxeqq.mongodb.net:27017,cluster0-shard-00-02-jxeqq.mongodb.net:27017/aggregations?replicaSet=Cluster0-shard-0" --authenticationDatabase admin --ssl -u m121 -p aggregations --norc

When I run the following aggregation, which ends with a sort:

var favorites = [
  "Sandra Bullock",
  "Tom Hanks",
  "Julia Roberts",
  "Kevin Spacey",
  "George Clooney"]

db.movies.aggregate([
  {
    $match: {
      "tomatoes.viewer.rating": { $gte: 3 },
      countries: "USA",
      cast: {
        $in: favorites
      }
    }
  },
  {
    $project: {
      _id: 0,
      title: 1,
      "tomatoes.viewer.rating": 1,
      num_favs: {
        $size: {
          $setIntersection: [
            "$cast",
            favorites
          ]
        }
      }
    }
  },
  {
    $sort: { num_favs: -1, "tomatoes.viewer.rating": -1, title: -1 }
  },
  {
    $limit: 10
  }
])

I get the following result:

{ "title" : "Gravity", "tomatoes" : { "viewer" : { "rating" : 4 } }, "num_favs" : 2 }
{ "title" : "A Time to Kill", "tomatoes" : { "viewer" : { "rating" : 3.6 } }, "num_favs" : 2 }
{ "title" : "Extremely Loud & Incredibly Close", "tomatoes" : { "viewer" : { "rating" : 3.5 } }, "num_favs" : 2 }
{ "title" : "Charlie Wilson's War", "tomatoes" : { "viewer" : { "rating" : 3.5 } }, "num_favs" : 2 }
{ "title" : "The Men Who Stare at Goats", "tomatoes" : { "viewer" : { "rating" : 3 } }, "num_favs" : 2 }

But if I change the projection slightly, from:

    $project: {
      _id: 0,
      title: 1,
      "tomatoes.viewer.rating": 1,

to:

    $project: {
      _id: 0,
      title: 1,
      rating: "$tomatoes.viewer.rating",

or if I get rid of the rating altogether:

    $project: {
      _id: 0,
      title: 1,

the resulting order also changes:

{ "title" : "The Men Who Stare at Goats", "rating" : 3, "num_favs" : 2 }
{ "title" : "Gravity", "rating" : 4, "num_favs" : 2 }
{ "title" : "Extremely Loud & Incredibly Close", "rating" : 3.5, "num_favs" : 2 }
{ "title" : "Charlie Wilson's War", "rating" : 3.5, "num_favs" : 2 }
{ "title" : "A Time to Kill", "rating" : 3.6, "num_favs" : 2 }

Notice how the movie Gravity is no longer the top result.

Does anyone understand why the sort order changes based on the projection? I did not expect the projection to affect the sorting.


Solution

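  • $sort runs after $project, so it only sees the fields that the projection outputs. When tomatoes.viewer.rating is renamed to rating or dropped, the sort key "tomatoes.viewer.rating" no longer exists in any document, and a missing field is compared as null. Every document then ties on that key, and the order falls through to title: -1, which is why the second result is in reverse alphabetical order by title.

    To solve this, keep tomatoes.viewer.rating in the projection, or sort on the name the projection actually outputs (rating in the renamed case).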
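
As a rough sketch of the fix for the renamed case, the pipeline below sorts on rating, the name that $project outputs, instead of on "tomatoes.viewer.rating". The helper missingSortKeys is hypothetical and not part of MongoDB; it is only a simplistic check that each sort key (or a dotted-path prefix of it) appears in the $project spec, and it does not account for _id being included by default.

```javascript
const favorites = [
  "Sandra Bullock",
  "Tom Hanks",
  "Julia Roberts",
  "Kevin Spacey",
  "George Clooney"
];

// Same pipeline as in the question, except that $sort only uses
// names that the $project stage outputs.
const pipeline = [
  {
    $match: {
      "tomatoes.viewer.rating": { $gte: 3 },
      countries: "USA",
      cast: { $in: favorites }
    }
  },
  {
    $project: {
      _id: 0,
      title: 1,
      rating: "$tomatoes.viewer.rating", // the field is renamed here ...
      num_favs: { $size: { $setIntersection: ["$cast", favorites] } }
    }
  },
  // ... so sort on "rating", not on "tomatoes.viewer.rating".
  { $sort: { num_favs: -1, rating: -1, title: -1 } },
  { $limit: 10 }
];

// Hypothetical helper (not a MongoDB API): returns the sort keys that the
// $project spec does not output, matching on the exact key or on a
// dotted-path prefix in either direction.
function missingSortKeys(projectSpec, sortSpec) {
  const projected = Object.keys(projectSpec).filter(
    k => projectSpec[k] !== 0 && projectSpec[k] !== false
  );
  return Object.keys(sortSpec).filter(
    key =>
      !projected.some(
        p => key === p || key.startsWith(p + ".") || p.startsWith(key + ".")
      )
  );
}

// In the mongo shell: db.movies.aggregate(pipeline)
```

Calling missingSortKeys(pipeline[1].$project, pipeline[2].$sort) on this pipeline returns an empty array, while passing the original sort spec from the question together with the renamed projection reports "tomatoes.viewer.rating" as missing.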