How to count distinct values of a reference collection in mongo


I have a list of books that points to a list of authors, and I want to display a tree in which each node shows an author's name and the number of books that author wrote. Initially I embedded the authors[] array directly in the books collection, and this worked like a charm with the aggregation framework. Later, however, I realised that it would be nice to attach some additional information to each author (e.g. their picture, biographical data, birth date). With the embedded approach, this is bad because:

  • it duplicates the data (not a big deal, and yes, I know that mongo's purpose is to encapsulate full objects, but let's ignore that for now);
  • whenever an author property is added or updated, the old records won't pick up the change unless I specifically query for some unique old property and update every book's authors with the new/updated values.

The next step was to use a second collection, called authors, with each books document referencing a list of author ids, like this:

{
    "_id" : ObjectId("58ed2a254374473fced950c1"),
    "authors" : [ 
        "58ed2a254d74s73fced950c1", 
        "58ed2a234374473fce3950c1"
    ],
    "title" : "Book title"
....
}

To get the author details, I have two options:

  • make an additional query to get the data from the author collection;
  • use DBRefs.

Questions:

  1. Does using DBRefs automatically load the author data into the book object, similar to what JPA's @ManyToOne does, for instance?
  2. Is it possible to get the number of books written by each author without querying for each author's book count separately? When the authors were embedded, I was able to aggregate the distinct author names together with the number of book documents each one appeared in. Is such a query possible across two collections?

What would be your recommendation for implementing this behaviour? (I am using Spring Data)


Solution

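  • You can try the aggregation below in a Spring Data MongoDB application. The names authorsIds, authors_collection, and book_collection are placeholders, so substitute your own (in the sample document from the question, the field is called authors). Note that $lookup matches by equality, so the ids stored in the book documents must have the same type as _id in the authors collection; string ids will not match ObjectId values.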

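    import java.util.List;
    import com.mongodb.BasicDBObject;
    import org.springframework.data.mongodb.core.aggregation.*;

    // One document per (book, author id) pair; "true" keeps books that have no author ids.
    UnwindOperation unwindAuthorIds = Aggregation.unwind("authorsIds", true);
    // Join each author id against _id in the authors collection; matches land in "ref".
    LookupOperation lookupAuthor = Aggregation.lookup("authors_collection", "authorsIds", "_id", "ref");
    // $lookup produces an array, so flatten it to a single author document.
    UnwindOperation unwindRefs = Aggregation.unwind("ref", true);
    // Count the book documents per author name.
    GroupOperation groupByAuthor = Aggregation.group("ref.authorName").count().as("count");

    Aggregation aggregation = Aggregation.newAggregation(unwindAuthorIds, lookupAuthor, unwindRefs, groupByAuthor);

    List<BasicDBObject> results = mongoOperations.aggregate(aggregation, "book_collection", BasicDBObject.class).getMappedResults();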
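
To sanity-check what the pipeline should return, it can help to mirror its stages in plain Java on a few hand-made records. The following is only an in-memory sketch and does not talk to MongoDB: the class AuthorBookCount, the method countBooksPerAuthor, and the sample ids and names are all made up for illustration. One deliberate difference: the pipeline above passes true to unwind, so unmatched entries end up grouped under a null key, whereas this sketch simply skips author ids that have no matching author.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Spring Data or the MongoDB driver.
class AuthorBookCount {

    // booksToAuthorIds: book title -> referenced author ids (stands in for the books collection).
    // authorNamesById:  author id  -> author name (stands in for the authors collection).
    static Map<String, Integer> countBooksPerAuthor(Map<String, List<String>> booksToAuthorIds,
                                                    Map<String, String> authorNamesById) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> authorIds : booksToAuthorIds.values()) {
            for (String authorId : authorIds) {                  // like $unwind on the id array
                String name = authorNamesById.get(authorId);     // like $lookup by _id
                if (name == null) {
                    continue;                                    // assumption: drop unmatched ids
                }
                counts.merge(name, 1, Integer::sum);             // like $group with a count
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> books = new LinkedHashMap<>();
        books.put("Book A", Arrays.asList("a1", "a2"));
        books.put("Book B", Arrays.asList("a1"));

        Map<String, String> authors = new HashMap<>();
        authors.put("a1", "Jane Doe");
        authors.put("a2", "John Roe");

        // prints {Jane Doe=2, John Roe=1}
        System.out.println(countBooksPerAuthor(books, authors));
    }
}
```

Running the real aggregation against the same small data set and comparing the counts is a quick way to spot a field-name or id-type mismatch in the lookup stage.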