MongoDB replica set: Database Size Difference


What are the possible reasons for a difference in database size between the primary and secondary nodes of a MongoDB replica set? In my setup, the database on the secondary node is larger than on the primary. Both nodes have the same number of objects, but the values of "avgObjSize", "dataSize" and "storageSize" are higher on the secondary. There is no replication lag either, as checked with rs.status().

What can I check?


Solution

  • Brief: The two nodes hold different amounts of unreclaimed space, and each node has its own padding factor.

    Long: This can happen if you have a long-running primary on which documents were deleted and inserted and no compact operation was ever run. The freed space is not reclaimed, so it is still counted in dataSize, avgObjSize and storageSize. A secondary, by contrast, may have been fully resynced from the primary, in which case only the operations in the current oplog are replayed on it. Such a secondary can have lower values for dataSize, avgObjSize and storageSize. If that secondary is later elected primary, you see the difference you describe: the new secondary (the old primary) reports the larger sizes. In addition, each server has its own padding factor, which is why dataSize differs as well.

    The concrete scenario may differ, but there are two main causes: the amount of unreclaimed space and the different padding factors.
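
To see which of the two causes is at play, it can help to put the stats of both nodes side by side. The following is only a minimal sketch: `compareDbStats` is a hypothetical helper (not part of MongoDB), the sample numbers are made up, and it assumes each input is a plain object shaped like the output of `db.stats()` with the fields named in the question.

```javascript
// Hypothetical helper: compares two db.stats()-style documents field by field.
// The field names are the ones mentioned in the question; the function name
// and the shape of the returned report are assumptions for illustration.
function compareDbStats(primaryStats, secondaryStats) {
  const fields = ["objects", "avgObjSize", "dataSize", "storageSize"];
  const report = {};
  for (const field of fields) {
    const p = primaryStats[field];
    const s = secondaryStats[field];
    report[field] = {
      primary: p,
      secondary: s,
      diff: s - p,                  // positive: the secondary is larger
      ratio: p > 0 ? s / p : null   // null when the primary value is not positive
    };
  }
  // Same object count but different dataSize points at per-record overhead
  // (padding or unreclaimed space) rather than missing or extra documents.
  report.sameObjectCount = primaryStats.objects === secondaryStats.objects;
  report.overheadLikely =
    report.sameObjectCount && primaryStats.dataSize !== secondaryStats.dataSize;
  return report;
}

// Made-up sample numbers, for illustration only.
const primary = { objects: 1000, avgObjSize: 200, dataSize: 200000, storageSize: 262144 };
const secondary = { objects: 1000, avgObjSize: 240, dataSize: 240000, storageSize: 524288 };
const report = compareDbStats(primary, secondary);
console.log(JSON.stringify(report, null, 2));
```

In practice you would run `db.stats()` in a shell connected to each member and pass the two results in instead of the sample objects. If the report shows the same object count with a larger dataSize on one node, running the `compact` command on the affected collections of that node, or resyncing it, is the usual way to bring the numbers back in line.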