node.js, mongodb, concurrency, locking, database-performance

MongoDB Performance: single collection vs multiple collections for concurrent read/writes


I'm using a local database on my web server to sync certain data from external APIs, and the web application is served from that local database. The synced data is different for each user who visits the web app. Since the sync job writes to the DB periodically but continuously while users are reading their data from the web page, I'm wondering which design would give me the best performance here.

Since the sync job is continuously writing to the DB, I believe the collection is locked until it's done. I'm thinking that having multiple collections would help here since the lock would be on a particular collection that is being written to rather than on a single collection every time.

Is my thinking correct here? I basically don't want reads to be throttled because the write operation keeps locking one collection.


Solution

  • The MongoDB documentation has extensive information on lock granularity and on locking in general.

    In general, writing to multiple collections can be faster than writing to a single collection, provided that "multiple" is a small to medium number and all of the collections are created in advance. The cost is that queries become awkward, and potentially slow: for example, you may have to perform joins via the aggregation pipeline instead of a single collection/index scan.

    If you have so many collections, and therefore so many open files, that either the DB or the OS starts evicting files from its cache, performance will start dropping again.

    Creating collections may also be relatively slow, so creating them under load can hurt performance.
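
As a rough illustration of the "small, fixed number of collections created in advance" idea above, here is a minimal Node.js sketch. The names `POOL_SIZE`, `collectionNameFor`, `ensureCollections`, and the `sync_data_` prefix are hypothetical and not from the original answer; the pool size of 8 is an arbitrary assumption. `db` is assumed to be a `Db` object from the official `mongodb` driver, of which only `listCollections()` and `createCollection()` are used.

```javascript
// Hypothetical helpers for illustration; names and pool size are assumptions.
const crypto = require("crypto");

const POOL_SIZE = 8; // assumed "small to medium" number of collections

// Deterministically map a user id to one collection name in the fixed pool.
function collectionNameFor(userId, poolSize = POOL_SIZE) {
  const digest = crypto.createHash("md5").update(String(userId)).digest();
  const bucket = digest.readUInt32BE(0) % poolSize;
  return `sync_data_${bucket}`;
}

// Create every collection in the pool that does not exist yet, so that
// no collection has to be created later while the system is under load.
// Returns the names of the collections that were newly created.
async function ensureCollections(db, poolSize = POOL_SIZE) {
  const existing = await db.listCollections({}, { nameOnly: true }).toArray();
  const have = new Set(existing.map((c) => c.name));
  const created = [];
  for (let i = 0; i < poolSize; i++) {
    const name = `sync_data_${i}`;
    if (!have.has(name)) {
      await db.createCollection(name);
      created.push(name);
    }
  }
  return created;
}
```

One possible use is to call `ensureCollections(db)` once at startup, before the sync job begins, and then have both the sync job and the web handlers pick their collection with `db.collection(collectionNameFor(userId))`. Whether this actually beats a single indexed collection depends on the workload and should be measured rather than assumed.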