rsync berkeley-db

Can you use rsync to replicate block changes in a Berkeley DB file?


I have a fairly large Berkeley DB file (~1GB), and I'd like to replicate the small changes that occur weekly to an alternate location without rewriting the entire file at the target.

Does rsync's block-level algorithm handle Berkeley DB files properly?

Does anyone know of an alternative that writes only the changes to the Berkeley DB files that are the targets of replication?

Thanks!


Solution

  • Rsync handles ordinary files well at the block level. With databases, problems can arise in several areas:

    1. Caching
    2. File locking
    3. Synchronization/transaction logs

    If you can ensure that no application has the Berkeley DB open while rsync runs, then rsync should work fine and offer a significant advantage over copying the entire file. However, depending on the configuration and version of BDB, there may also be transaction logs to deal with. You probably want to investigate the same mechanisms used for backups and hot backups. Berkeley DB also has a "snapshot" feature that might better facilitate a working solution.

    You should probably read this carefully: http://www.cs.sunysb.edu/documentation/BerkeleyDB/ref/transapp/archival.html

    I'd also recommend you consider replication, an alternative that is blessed by BDB itself: https://idlebox.net/2010/apidocs/db-5.1.19.zip/programmer_reference/rep.html

    Oracle now calls this High Availability: http://www.oracle.com/technetwork/database/berkeleydb/overview/high-availability-099050.html
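
Coming back to the rsync option: by default rsync builds a new copy of the file at the destination and renames it into place, so the target file is rewritten even when little data crosses the network. The `--inplace` option asks rsync to write updated data directly into the existing destination file, and `--no-whole-file` forces the delta algorithm even when both paths are local. Below is a minimal sketch in Python, assuming the database is closed on both sides during the transfer. The function names, the 4096-byte block size, and any paths you pass in are illustrative assumptions, not part of rsync or Berkeley DB, and `changed_blocks` is a fixed-offset simplification rather than rsync's rolling-checksum matching.

```python
import subprocess


def build_rsync_command(source, destination, in_place=True):
    """Return the rsync argument list for a delta transfer of one file."""
    # --no-whole-file forces the delta algorithm (rsync skips it for local paths).
    command = ["rsync", "--archive", "--no-whole-file"]
    if in_place:
        # Update the existing destination file instead of writing a new copy.
        command.append("--inplace")
    command += [source, destination]
    return command


def replicate(source, destination):
    """Run rsync; only call this while no process has the database open."""
    subprocess.run(build_rsync_command(source, destination), check=True)


def changed_blocks(old_path, new_path, block_size=4096):
    """Count blocks that differ at the same offset in two local files.

    Simplification: rsync can also match blocks that moved to a new offset;
    this helper only compares blocks at identical positions.
    """
    changed = 0
    total = 0
    with open(old_path, "rb") as old_file, open(new_path, "rb") as new_file:
        while True:
            old_block = old_file.read(block_size)
            new_block = new_file.read(block_size)
            if not old_block and not new_block:
                break
            total += 1
            if old_block != new_block:
                changed += 1
    return changed, total
```

Running `changed_blocks` on last week's copy and the current file gives a rough idea of how much a block-level transfer could save. Note that with `--inplace` the destination file is inconsistent while the transfer is running, so nothing should read the target database until rsync has finished.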