Search code examples
arraysmongodbnosqlaggregation-frameworkunwind

why i didn't get The same result for these two different queries for Get actors who have appeared in at least 5 films


I'd like to retrieve a list of actors who have appeared in a minimum of 5 films using two different queries. In the first query, $unwind is not used, but within the $group stage, it matches every actor's details. This query aims to collect and count actors without breaking down the actors array.

In the second query, $unwind is utilized to deconstruct the table into separate objects, allowing us to analyze individual actors. While the end goal is to obtain a list of actors(Table of object) who have participated in at least 5 films, the use of $unwind makes the second query slightly different.

First Request

db.films.aggregate([
  {
        $unwind:'$actors'
  },
  {
    $group: {
      _id: {
        prenom:'$actors.first_name',
        nom:'$actors.last_name'
      },
      nbr_films: {
        $sum: 1
      }
    }
  },
  {
    $match: {
      nbr_films:{$gte:5}
    }
  },
  {
    $project: {
        _id:0,
        nom:"$_id.nom",
        nbr_films:1
    }
  }
]);

returns:

[
  {
    "nbr_films": 6,
    "nom": "Willis"
  },
  {
    "nbr_films": 5,
    "nom": "Oldman"
  },
  {
    "nbr_films": 6,
    "nom": "De Niro"
  },
  {
    "nbr_films": 5,
    "nom": "Caine"
  },
  {
    "nbr_films": 6,
    "nom": "Freeman"
  }
]

second request:

  $group: {
    _id: {
      prenom:'$actors.first_name',
      nom:'$actors.last_name'
    },
    nbr_films: {
      $sum: 1
    }
  }
},
{
  $match: {
    nbr_films:{$gte:5}
  }
},
{
  $project: {
      _id:0,
      nom:"$_id.nom",
      nbr_films:1


  }
}
]); ``` 
**Returns: 
**```
[
{
  "nbr_films": 5
}
]

Solution

  • First, I assume 'actors' is an array because the first query uses '$unwind' on it.

    From the rest of query, it would appear that the actors array contains an array of documents like {first_name:"Bruce",last_name:"Willis"}.

    Without unwinding, $actors.first_name will resolve to an array containing all of the first_name values from every object in the 'actors' array.

    For example, if the document looks like:

    { "actors": [
      { "first_name": "Bruce", "last_name": "Willis" },
      { "first_name": "Gary", "last_name": "Oldman" },
      { "first_name": "Robert", "last_name": "De Niro" },
      { "first_name": "Michael", "last_name": "Caine" },
      { "first_name": "Morgan", "last_name": "Freeman" }
    ]}
    

    Then "$actors.first_name" will resolve to ["Bruce", "Gary", "Robert", "Michael", "Morgan"] Example

    When $unwind is used, a new set of documents is created from each input document, each containing a single element from the array. Example

    In the first query, the actors array was unwound at the start, so the $group's _id contained the first and last name of a single actor, and the count would then be the number of movies that used that actor.

    In the second query, actors was not unwound, so the $group's _id contained an array of first_names and an array of last_names for all of the actors in that movie. The count would then be the number of movies that used the same actors, and whose names appeared in the same order in the actors array.