mongodb, go, bson, mgo

mgo - bson.ObjectId vs string id


Using mgo, it seems that best practice is to set object ids to be bson.ObjectId.

This is not very convenient: instead of a plain string, the id is stored as binary in the DB. Googling this yields tons of questions like "how do I get a string out of the bson id?", and indeed in Go the ObjectId has a Hex() method that returns the string.

The bson becomes even more annoying to work with when exporting data from mongo to another DB platform. This comes up when big data is collected and you want to merge it with some properties from the back-office mongo DB. It means a lot of pain: you need to transform the binary ObjectId to a string in order to join with the id on platforms that do not use the bson representation.

My question is: what are the benefits of using bson.ObjectId vs string id? Will I lose anything significant if I store my mongo entities with a plain string id?


Solution

  • As was already mentioned in the comments, storing the ObjectId as a hex string would double the space needed for it (24 characters instead of 12 bytes), and if you want to extract one of its values, such as the creation timestamp, you'd first need to construct an ObjectId from that string.
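
    To make the size and extraction argument concrete, here is a minimal sketch using only the Go standard library. The helpers objectIDFromHex and creationTime are hypothetical names made up for this illustration, not part of mgo; in real code mgo's bson.ObjectIdHex and the ObjectId's Time() method should cover the same ground. The sketch assumes the usual ObjectId layout, where the first 4 of the 12 bytes hold a big-endian Unix timestamp.

    ```go
    package main

    import (
    	"encoding/binary"
    	"encoding/hex"
    	"errors"
    	"fmt"
    	"time"
    )

    // objectIDFromHex is a hypothetical helper (not part of mgo): it decodes a
    // 24-character hex string into the 12 raw bytes of an ObjectId.
    func objectIDFromHex(s string) ([]byte, error) {
    	raw, err := hex.DecodeString(s)
    	if err != nil {
    		return nil, err
    	}
    	if len(raw) != 12 {
    		return nil, errors.New("an ObjectId must be exactly 12 bytes")
    	}
    	return raw, nil
    }

    // creationTime reads the first 4 bytes of the raw ObjectId as a big-endian
    // count of seconds since the Unix epoch (assumed layout, see above).
    func creationTime(raw []byte) time.Time {
    	secs := binary.BigEndian.Uint32(raw[:4])
    	return time.Unix(int64(secs), 0).UTC()
    }

    func main() {
    	const hexID = "56b0d36c23da2af0363abe37"

    	// The string form has to be decoded before any field can be read.
    	raw, err := objectIDFromHex(hexID)
    	if err != nil {
    		panic(err)
    	}

    	fmt.Println("hex string length:", len(hexID))
    	fmt.Println("binary length:    ", len(raw))
    	fmt.Println("created:          ", creationTime(raw))
    }
    ```

    The decode step in main is the extra work the string form costs you every time you need something out of the id.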

    But you have a misconception. There is absolutely no need to use an ObjectId for the mandatory _id field. Quite often, I advise against that. Here is why.

    Take the simple example of a book, with relations and some other considerations set aside for simplicity:

    {
      _id: ObjectId("56b0d36c23da2af0363abe37"),
      isbn: "978-3453056657",
      title: "Neuromancer",
      author: "William Gibson",
      language: "German"
    }
    

    Now, what use would the ObjectId have here? Actually none. It would be an index with hardly any use, since you would never search your book database by an artificial key like that. It holds no semantic value. It would be a unique ID for an object which already has a globally unique ID – the ISBN.

    So we simplify our book document like this:

    {
      _id: "978-3453056657",
      title: "Neuromancer",
      author: "William Gibson",
      language: "German"
    }
    

    We have reduced the size of the document, made use of a preexisting globally unique ID, and no longer have a basically unused index.
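
    On the Go side, the simplified document could map to a struct along these lines. This is only a sketch: the Book type and the idField helper are illustrative names, and the block uses just the standard library so it runs without a database. The `bson` struct tags follow the convention that mgo's bson package reads when it marshals a struct.

    ```go
    package main

    import (
    	"fmt"
    	"reflect"
    )

    // Book uses the ISBN as its _id, so no separate ObjectId is stored.
    // The type is illustrative; the `bson` tags name the document fields.
    type Book struct {
    	ISBN     string `bson:"_id"`
    	Title    string `bson:"title"`
    	Author   string `bson:"author"`
    	Language string `bson:"language"`
    }

    // idField is a hypothetical helper: it returns the name of the struct
    // field tagged as `_id`, or "" if v is not a struct or has no such field.
    func idField(v interface{}) string {
    	t := reflect.TypeOf(v)
    	if t == nil || t.Kind() != reflect.Struct {
    		return ""
    	}
    	for i := 0; i < t.NumField(); i++ {
    		if t.Field(i).Tag.Get("bson") == "_id" {
    			return t.Field(i).Name
    		}
    	}
    	return ""
    }

    func main() {
    	b := Book{
    		ISBN:     "978-3453056657",
    		Title:    "Neuromancer",
    		Author:   "William Gibson",
    		Language: "German",
    	}
    	fmt.Printf("%s is keyed by %s = %q\n", b.Title, idField(b), b.ISBN)
    }
    ```

    With a struct like this, a lookup through mgo would presumably go through the collection's FindId method with the plain ISBN string, with no conversion step in between.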

    Back to your basic question of whether you lose something by not using ObjectIds: quite often, not using the ObjectId is the better choice. But if you use it, use the binary form.