node.js mongodb mongoskin

rename collection vs updating collection


I have a MongoDB database that I need to update daily (deleting documents that are no longer relevant and adding new ones). The database is not sharded.

I take the data from an external data master which is not so easy to work with.

There are two options:

1. Re-ingest the entire DB (not so big) into a temp collection and then rename it to the old collection name (with dropTarget set to true).
2. Do the hard work myself: delete the old entries, figure out from the data master which new documents are relevant, and insert them into the DB.
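
As a rough sketch, option 1 might look like the following. It assumes the official `mongodb` Node.js driver's collection methods (`deleteMany`, `insertMany`, `rename`); mongoskin wraps an older driver, so its calls may differ. The helper name `reingestAndSwap` and the `_tmp` suffix are made up for illustration.

```javascript
// Sketch of option 1: load everything into a temp collection, then swap it in.
// Assumes a `db` handle from the official `mongodb` Node.js driver.
// `reingestAndSwap` and the `_tmp` suffix are hypothetical names.
async function reingestAndSwap(db, targetName, docs) {
  if (docs.length === 0) {
    // Guard against replacing the live collection with nothing.
    throw new Error('no documents to ingest');
  }

  const tempName = targetName + '_tmp';
  const temp = db.collection(tempName);

  // Start from an empty temp collection in case a previous run failed midway.
  await temp.deleteMany({});
  await temp.insertMany(docs);

  // Rename the temp collection over the live one; dropTarget drops the old one.
  await temp.rename(targetName, { dropTarget: true });
}
```

Any indexes the live collection needs would have to be created on the temp collection before the rename, because the old collection's indexes are dropped along with it.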

Option 1 is obviously preferable, but what is the impact? I'm doing this maintenance at a late hour, but I don't want users to get errors when querying the DB during the rename.

Is using rename to overwrite a collection a standard way to get things done, or am I abusing the API? :)


Solution

  • According to the documentation, renameCollection blocks all database activity for the duration of the operation. If your users have set a sufficiently large timeout, they will not be directly affected by the rename itself; however, because the dataset changes under their feet, there may be side effects. For example, renaming a collection can invalidate open cursors, which interrupts queries that are currently returning data.

    Regarding renaming collections in production: personally, I would avoid it where possible, firstly because of the cursor issue above, but more importantly because an incomplete renameCollection operation can leave the target collection in an unusable state and require manual intervention to clean up. Instead, I would use an update with upsert: true that overwrites the entire document or inserts a new one if it doesn't exist.
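
The upsert approach could be sketched roughly as below. It assumes each document from the data master carries a stable `_id` and that the operations are passed to `bulkWrite` from the official `mongodb` Node.js driver (mongoskin may expose this differently). `buildSyncOps` is a hypothetical helper, not part of any driver.

```javascript
// Sketch of the upsert approach: build operations for collection.bulkWrite().
// Assumes every fresh document has a stable `_id` taken from the data master.
// `buildSyncOps` is a hypothetical helper name.
function buildSyncOps(freshDocs) {
  const ids = freshDocs.map((doc) => doc._id);

  // Overwrite each existing document, or insert it if it is missing.
  const ops = freshDocs.map((doc) => ({
    replaceOne: { filter: { _id: doc._id }, replacement: doc, upsert: true },
  }));

  // Remove documents that are no longer present in the data master.
  ops.push({ deleteMany: { filter: { _id: { $nin: ids } } } });
  return ops;
}

// Possible usage (assumed driver API):
//   await db.collection('items').bulkWrite(buildSyncOps(freshDocs));
```

Note that an empty `freshDocs` array would produce a single `deleteMany` that matches every document, so a caller would probably want to check for that case before running the operations.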