Tags: mongodb, mongodb-c, nosql

Mongo C driver, updating documents as fast as possible


Quite simply, I need to store time series data in a document. I have decided that having one document hold a 30-minute period of data is reasonable. This is only one of a few hundred to a few thousand documents that will be updated every second. A document could look like this:

{
    _id: "APAC.tky001.cpu.2011.12.04:10:00",
    field1: XX,
    field2: YY,
    1322971800: 22,
    1322971801: 23,
    1322971802: 21,

    // and so on
 }

This means that every 30 minutes, I create the document with _id, field1 and field2. Then, every second I would like to add a timestamp/value combination.
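The bucketing described above can be sketched in plain C, without any driver calls. This is only an illustration: the function names are hypothetical, the "%Y.%m.%d:%H:%M" layout is a guess at the _id format shown in the sample document, and UTC is assumed, so the resulting string will not necessarily match the sample _id.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BUCKET_SECONDS (30 * 60) /* one document per 30-minute period */

/* Round a non-negative Unix timestamp down to the start of its bucket. */
long long bucket_start(long long ts)
{
    assert(ts >= 0);
    return ts - (ts % BUCKET_SECONDS);
}

/* Build the _id of the bucket document that a sample taken at `ts`
 * belongs to, e.g. "<series>.2011.12.04:04:00".
 * The date layout and the use of UTC are assumptions.
 * Returns 0 on success, -1 on bad input or if `buf` is too small. */
int bucket_id(char *buf, size_t len, const char *series, long long ts)
{
    char stamp[32];
    time_t start;
    struct tm *utc;
    int n;

    assert(buf != NULL && series != NULL);
    if (strlen(series) == 0)
        return -1;

    start = (time_t)bucket_start(ts);
    utc = gmtime(&start);
    if (utc == NULL ||
        strftime(stamp, sizeof stamp, "%Y.%m.%d:%H:%M", utc) == 0)
        return -1;

    n = snprintf(buf, len, "%s.%s", series, stamp);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Field name for a single sample: the timestamp as decimal digits. */
int sample_key(char *buf, size_t len, long long ts)
{
    int n = snprintf(buf, len, "%lld", ts);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Every second, the writer would compute bucket_id() to select the document and sample_key() to name the field to set. All samples within the same 30 minutes map to the same _id.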

I am using the mongo C library. I was assuming it would be superfast, but the way I am doing this requires a mongo_update per value, which cannot be done in bulk. I don't think there's a way to use mongo_insert_batch here.

Unfortunately, it's super slow: terrible performance. Am I doing this completely incorrectly? By terrible, I mean that in a crude test I get about 600 updates/second, whereas in an alternate db (not naming names) I get 27,000/second.

The code is approximately:

for (i=0;i<N;i++) {
    if (mongo_update(c,n,a,b,MONGO_UPDATE_UPSERT,write_concern) != MONGO_OK)
        // stuff
}

Setting the write concern off or on makes no difference.


Solution

  • Each update probably grows the document beyond the space allocated for it. The update is then no longer cheap, because mongo has to copy the whole document to a new location. You could pad documents manually: insert a large dummy value when creating the document and remove it later, so that subsequent updates happen in place. I'm not sure whether you can manipulate the collection-level paddingFactor directly.

    In that other, unnamed database you are probably inserting one row per entry, which is a totally different operation from what you are doing here.
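To get a feel for how large the dummy padding value would need to be, here is a rough sizing sketch in plain C (no driver calls). It uses the BSON element layouts: an int32 element is a type byte, the NUL-terminated key, and 4 bytes of value; a string element is a type byte, the NUL-terminated key, a 4-byte length, the bytes, and a trailing NUL. The function names and the "pad" field name are hypothetical, and it assumes the samples are stored as int32. If they are stored as doubles, each value takes 8 bytes instead of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* BSON int32 element: type byte + key + NUL + 4-byte value. */
size_t bson_int32_element_size(const char *key)
{
    assert(key != NULL);
    return 1 + strlen(key) + 1 + 4;
}

/* BSON string element: type byte + key + NUL + 4-byte length
 * + string bytes + trailing NUL. */
size_t bson_string_element_size(const char *key, size_t str_len)
{
    assert(key != NULL);
    return 1 + strlen(key) + 1 + 4 + str_len + 1;
}

/* Bytes taken by all samples of a full bucket, assuming every key has
 * the same length as `example_key` and every value is an int32. */
size_t full_bucket_sample_bytes(size_t samples, const char *example_key)
{
    return samples * bson_int32_element_size(example_key);
}

/* Length of the dummy string to store under `pad_key` so that the
 * padding element occupies `target` bytes (0 if target is too small). */
size_t padding_string_length(const char *pad_key, size_t target)
{
    size_t overhead = bson_string_element_size(pad_key, 0);
    return target > overhead ? target - overhead : 0;
}
```

With one sample per second for 30 minutes (1800 samples) and 10-digit timestamp keys, full_bucket_sample_bytes(1800, "1322971800") comes to 28800 bytes, so a dummy string of roughly that size would reserve room for the whole bucket under these assumptions.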