Tags: mongodb, data-modeling, datamodel

Best MongoDB data model for a response-time statistics website


In my project, I have servers that send ping requests to websites, measure their response time, and store it every minute.

I'm going to use MongoDB and I'm looking for the best data model. Which data model is better?

1- Have a collection for each website, with each request as a document (1,000 collections).

or

2- Have a single collection for all websites, with each website as a document and each request as a sub-document.


Solution

  • Each option runs into a different MongoDB limitation. With the first option (one collection per website), the limit is the number of collections: every collection has an entry in the namespace file, which is 16 MB by default and holds roughly 24,000 entries. (The size of the namespace file can be increased.) In my opinion this is the much better solution: you expect about 1,000 collections, which is well within that limit. (Keep in mind that each index has its own namespace entry and counts toward the same limit.) With this model each request is stored as its own document, which is generally much easier to work with afterwards than an embedded array.

    Embedded array limitation. The limit in the second option is a hard one: a document cannot grow larger than 16 MB, the maximum BSON document size. That leaves room for quite a lot of data, but if you use huge documents that vary in size and keep growing over time, your storage will become fragmented. The reason will be clear if you watch this webinar. Basically, this is the worst thing you can do in terms of storage usage.
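To get a feel for how quickly the 16 MB limit could be reached with one sample per minute, here is a rough back-of-the-envelope sketch. The 40 bytes per embedded sample is an assumption (a timestamp, a response time, and some array overhead), not a measured value, so treat the result as an order of magnitude only.

```python
# Rough estimate of how long one website document can keep absorbing
# embedded samples before it reaches the 16 MB BSON document limit.

MAX_BSON_BYTES = 16 * 1024 * 1024  # MongoDB's maximum BSON document size
BYTES_PER_SAMPLE = 40              # ASSUMPTION: timestamp + response time + array overhead
SAMPLES_PER_DAY = 24 * 60          # one ping per minute, as in the question


def days_until_full(bytes_per_sample: int = BYTES_PER_SAMPLE) -> float:
    """Return how many days of per-minute samples fit in one document."""
    max_samples = MAX_BSON_BYTES // bytes_per_sample
    return max_samples / SAMPLES_PER_DAY


if __name__ == "__main__":
    print(f"about {days_until_full():.0f} days of samples fit in one document")
```

Under that assumption a single website document fills up in well under a year, and larger samples shorten that further, so the embedded model would need some extra splitting scheme anyway.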

    If you are likely to use the aggregation framework for further analysis, that will also be harder with the embedded-array approach.
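As an illustration of the aggregation point, here is a minimal sketch of the same question ("average response time in a time window") asked of both models. The field names `ts`, `ms`, and `pings` and the helper functions are hypothetical, chosen only for this example. The pipelines are plain Python lists that could be passed to pymongo's `collection.aggregate(...)`; building them does not need a database connection.

```python
from datetime import datetime


def flat_pipeline(start: datetime, end: datetime) -> list:
    """Option 1: one collection per website, one document per request.

    Assumed document shape (hypothetical): {"ts": <datetime>, "ms": <int>}
    """
    return [
        {"$match": {"ts": {"$gte": start, "$lt": end}}},
        {"$group": {"_id": None, "avg_ms": {"$avg": "$ms"}}},
    ]


def embedded_pipeline(site_id: str, start: datetime, end: datetime) -> list:
    """Option 2: one document per website with an embedded array of requests.

    Assumed document shape (hypothetical):
    {"_id": <site id>, "pings": [{"ts": <datetime>, "ms": <int>}, ...]}
    """
    return [
        {"$match": {"_id": site_id}},
        # The array has to be unwound into one document per sample first.
        {"$unwind": "$pings"},
        {"$match": {"pings.ts": {"$gte": start, "$lt": end}}},
        {"$group": {"_id": "$_id", "avg_ms": {"$avg": "$pings.ms"}}},
    ]


if __name__ == "__main__":
    window = (datetime(2014, 1, 1), datetime(2014, 1, 2))
    print(flat_pipeline(*window))
    print(embedded_pipeline("example.com", *window))
```

With the flat model the time filter in `$match` can use an index on `ts` directly, while the embedded model has to `$unwind` the whole array of a website before it can filter individual samples.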