
Go - mgo, retrieve all nested fields from the collection


I have a collection whose documents are structured as follows:

{
    name: "Jane",
    films: [
        {
            title: "The Shawshank Redemption",
            year: "1994"
        },
        {
            title: "The Godfather",
            year: "1972"
        }
    ]
},
{
    name: "Jack",
    films: [
        {
            title: "12 Angry Men",
            year: "1957"
        },
        {
            title: "The Dark Knight",
            year: "2008"
        }
    ]
}

I want to return a slice of all films ([]Film) and, if possible, in a separate query, a slice of all titles ([]string) from the collection. I could pull out the entire collection and extract the relevant data in application logic, but can this be done within a query? I tried the Select() method, something like c.Find(nil).Select(<various conditions>).All(&results), but without success.


Solution

  • I think this is one of the most popular questions on the mongo tag. If you need this kind of query often, it may be a sign that the data model does not fit, and an RDBMS might serve you better: the performance cost of such queries can cancel out the benefits of Mongo's features such as the schemaless design and "all-in-one" documents.

    Anyway, the answer is simple: you can't get the list of films the way you want with find. Mongo's find can return only complete or partial top-level documents. The best result you can get with the db.collection.find({}, {'films': 1}) query is a list like this (_id fields omitted):

    {
        films: [
            {
                title: "The Shawshank Redemption",
                year: 1994
            },
            {
                title: "The Godfather",
                year: 1972
            }
        ]
    },
    {
        films: [
            {
                title: "12 Angry Men",
                year: 1957
            },
            {
                title: "The Dark Knight",
                year: 2008
            }
        ]
    }
    

    Not what you expected, right?

    The only way to get an array like

    {
        title: "The Shawshank Redemption",
        year: 1994
    },
    {
        title: "The Godfather",
        year: 1972
    },
    {
        title: "12 Angry Men",
        year: 1957
    },
    {
        title: "The Dark Knight",
        year: 2008
    }
    

    is to use aggregation.

    The basic Mongo query to retrieve the array of films is

    db.collection.aggregate([{
        $unwind: '$films'
    }, {
        $project: {
            title: '$films.title',
            year: '$films.year'
        }
    }])
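
    To make the two stages concrete, here is a minimal sketch that emulates them in plain Go over in-memory data. It does not talk to MongoDB; the Person and Film types and the unwindFilms helper are hypothetical names introduced only for illustration, and year is assumed to be a number.

```go
package main

import "fmt"

// Film and Person mirror the document shape from the question
// (hypothetical types, not part of mgo).
type Film struct {
	Title string
	Year  int
}

type Person struct {
	Name  string
	Films []Film
}

// unwindFilms emulates $unwind on "films" followed by a $project that keeps
// only the film fields: one output element per array entry, in document order.
func unwindFilms(people []Person) []Film {
	var out []Film
	for _, p := range people {
		for _, f := range p.Films {
			out = append(out, Film{Title: f.Title, Year: f.Year})
		}
	}
	return out
}

func main() {
	people := []Person{
		{Name: "Jane", Films: []Film{{"The Shawshank Redemption", 1994}, {"The Godfather", 1972}}},
		{Name: "Jack", Films: []Film{{"12 Angry Men", 1957}, {"The Dark Knight", 2008}}},
	}
	for _, f := range unwindFilms(people) {
		fmt.Println(f.Title, f.Year)
	}
}
```

    This is the same work the question describes doing in application logic; the aggregation pipeline simply moves it to the server.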
    

    The Go code for this query is

    package main
    
    import (
        "fmt"

        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )
    
    func main() {
        session, err := mgo.Dial("mongodb://127.0.0.1:27017/db")
    
        if err != nil {
            panic(err)
        }
        defer session.Close()
        session.SetMode(mgo.Monotonic, true)
    
        c := session.DB("db").C("collection")
    
        pipe := c.Pipe(
            []bson.M{
                bson.M{
                    "$unwind": "$films",
                },
                bson.M{
                    "$project": bson.M{
                        "title": "$films.title",
                        "year": "$films.year",
                    },
                },
            },
        )
        result := []bson.M{}
        err = pipe.All(&result)
        if err != nil {
            panic(err)
        }
        fmt.Printf("%+v", result) // [map[_id:ObjectIdHex("57a2ed6640ce01187e1c9164") title:The Shawshank Redemption year:1994] map[_id:ObjectIdHex("57a2ed6640ce01187e1c9164") title:The Godfather year:1972] map[_id:ObjectIdHex("57a2ed6f40ce01187e1c9165") title:12 Angry Men year:1957] map[year:2008 _id:ObjectIdHex("57a2ed6f40ce01187e1c9165") title:The Dark Knight]]
    }
    

    If you need additional conditions to select top-level documents, add a $match stage first:

    pipe := c.Pipe(
        []bson.M{
            bson.M{
                "$match": bson.M{
                    "name": "Jane",
                },
            },
            bson.M{
                "$unwind": "$films",
            },
            bson.M{
                "$project": bson.M{
                    "title": "$films.title",
                    "year": "$films.year",
                },
            },
        },
    )
    // result [map[_id:ObjectIdHex("57a2ed6640ce01187e1c9164") title:The Shawshank Redemption year:1994] map[title:The Godfather year:1972 _id:ObjectIdHex("57a2ed6640ce01187e1c9164")]]
    

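    And if you need to filter the films themselves, add a $match stage after $project. This assumes year is stored as a number, as in the outputs above; against the string years in the question's sample, $gt: 2000 would match nothing.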

    pipe := c.Pipe(
        []bson.M{
            bson.M{
                "$unwind": "$films",
            },
            bson.M{
                "$project": bson.M{
                    "title": "$films.title",
                    "year": "$films.year",
                },
            },
            bson.M{
                "$match": bson.M{
                    "year": bson.M{
                        "$gt": 2000,
                    },
                },
            },
        },
    )
    // result [map[_id:ObjectIdHex("57a2ed6f40ce01187e1c9165") year:2008 title:The Dark Knight]]
    

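    The problem with aggregation is that most stages cannot use indexes, so it can be slow on large collections. A $match (or $sort) at the very start of the pipeline can use an index, but once $unwind or $project has reshaped the documents, later stages work on the intermediate results without one. That's why I suggested considering an RDBMS, which can be a better choice if you need a lot of aggregations.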
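
    One common way to soften this cost is to put a coarse $match at the very start of the pipeline, where it can use an index. The sketch below only builds and prints such a pipeline. It assumes year is stored as a number and that an index on films.year exists (both assumptions), and it uses a local type M as a stand-in for bson.M; filmsAfter is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// M stands in for bson.M, which mgo also defines as map[string]interface{}.
type M map[string]interface{}

// filmsAfter builds a pipeline that returns films released after minYear.
func filmsAfter(minYear int) []M {
	newer := M{"$gt": minYear}
	return []M{
		// Runs first, so it can use an index on films.year (if one exists)
		// to skip documents that contain no matching film at all.
		{"$match": M{"films.year": newer}},
		{"$unwind": "$films"},
		{"$project": M{"title": "$films.title", "year": "$films.year"}},
		// Still needed: a matching document can contain older films too.
		{"$match": M{"year": newer}},
	}
}

func main() {
	out, err := json.MarshalIndent(filmsAfter(2000), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

    With mgo, the same stages written with bson.M would be passed to c.Pipe as in the examples above. Whether the first stage actually helps depends on how selective the condition is.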

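    As for []string: an aggregation always yields documents, not bare values, so the pipeline itself cannot return a slice of strings. Pipe.All decodes those documents into whatever slice you pass (the examples here use []bson.M, where bson.M is map[string]interface{}), so you would collect the titles in Go afterwards. If unique titles are enough, c.Find(nil).Distinct("films.title", &titles) can fill a []string directly.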
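
    Collecting the titles from the decoded results takes only a short loop. This is a minimal sketch over plain maps so that it runs without a database; titles is a hypothetical helper, and with mgo its parameter would be declared as []bson.M instead.

```go
package main

import "fmt"

// titles collects the "title" field from decoded aggregation results.
// Documents with a missing or non-string title are skipped.
func titles(docs []map[string]interface{}) []string {
	out := make([]string, 0, len(docs))
	for _, d := range docs {
		if t, ok := d["title"].(string); ok {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// Sample data shaped like the $unwind/$project output shown earlier.
	docs := []map[string]interface{}{
		{"title": "The Shawshank Redemption", "year": 1994},
		{"title": "The Godfather", "year": 1972},
	}
	fmt.Println(titles(docs)) // [The Shawshank Redemption The Godfather]
}
```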