Search code examples
mongodbsymfonydoctrine-mongodb

How sort an array in a collection


I've been looking for sorting a specific inner array in a collection, I use doctrine-mongodb-bundle in symfony2. My collection :

"page":
{
    "_id": "56rgt46rt54h68rt4h6"
    "categories":
    [{
        "_id": "2g56rt1h65rt165165erg4"
        "products":[
            {"name":"A", "pos": 2},
            {"name":"B", "pos": 7},
            {"name":"C", "pos": 1},
            {"name":"D", "pos": 5}
        ]
    }]
}

I wish I had :

"page":
{
    "_id": "56rgt46rt54h68rt4h6"
    "categories":
    [{
        "_id": "2g56rt1h65rt165165erg4"
        "products":[
            {"name":"C", "pos": 1},
            {"name":"A", "pos": 2},
            {"name":"D", "pos": 5},
            {"name":"B", "pos": 7}
        ]
    }]
}

And my entities :

/**
 * @MongoDB\EmbeddedDocument
 */
class Category
{
    /**
     * @MongoDB\Id(strategy="auto")
     */
    protected $id;

    /**
     * @MongoDB\String
     */
    protected $name;

    /** @MongoDB\EmbedMany(targetDocument="\Document\Product") */
    private $products = array();
}

/**
 * @MongoDB\EmbeddedDocument
 */
class Product
{
    /**
     * @MongoDB\Int
     */
    protected $pos;

    /**
     * @MongoDB\String
     */
    protected $name;
}

I'm new with DoctrineMongoDBBundle, maybe is not a good idea to cumulate inner array (EmbedMany), and it would be better to have several collections. And so save reference.


Solution

  • Assuming the items in your products array are unique, there isn't any easy server-side support for maintaining this array in sorted order as at MongoDB 2.4. Your best option given the nested arrays will be to sort the arrays in your application logic as appropriate (i.e. on insert/update, or on retrieval/display).

    Data Modeling Considerations

    If you need to do a lot of manipulation of nested array entries, you should consider flattening your data model to make it easier to work with. Your design goal with MongoDB should be to have a data model that is appropriate for your application use cases with acceptable performance balance between ease of inserting/updating/querying. You definitely do not have to model everything in a single collection/query if it doesn't make sense to do so, and you should be prepared to denormalize (duplicate) data. For many-to-many relationship such as products <=> categories it is typical to embed and denormalize whichever entity is less frequently updated (for example, embedding categories in products).

    Persisting sorted, capped arrays (non-unique items)

    If you want to persist arrays in sorted order and the items are not unique, MongoDB 2.4 does have the option to $push to a sorted array but this must be used in conjuction with a slice (array limit). If you $push identical entries to a sorted array you will end up with duplicates (so this likely isn't what you are after).

    Sample update, assuming page in your example was the collection name:

    db.page.update(
        // Query
        { "_id": "56rgt46rt54h68rt4h6" },
    
        // Update sorted array
        // NB: referring by array index position 0, as there isn't a named reference
        { $push : {
            "categories.0.products" : { 
    
                // List of new elements to push
                $each : [
                    { "name" : "E", "pos": 3 }
                ],
    
                // Sort by pos (ascending)
                $sort : { "pos" : 1 },
    
                // Maximum number of array elements (required for $sort)
                $slice : -100
            }
        }}
    )