linux md5 rsync checksum

How can I print/log the checksum calculated by rsync?


I have to transfer millions of files of widely varying sizes, totaling almost 100 TB, between two Linux servers. Doing this the first time with rsync is easy and quite safe, because the data can be checksummed.

However, I need to keep a list of the files and their checksums so that I can run regular integrity checks in the future.

Is there a way to tell rsync to print/log the checksum of each file? If that is not feasible, which tool/command would you recommend, given that performance is very important?

Thanks in advance!


Solution

  • Since rsync 3.1.0 (released on 28 Sep 2013), the MD5 checksum computed for the transfer can be included in the log output:

    Added the "%C" escape to the log-output handling, which will output the MD5 checksum of any transferred file, or all files if --checksum was specified (when protocol 30 or above is in effect).

    For example, the log format "%i %f B:%l md5:%C" logs each transfer in a form similar to:

    >f+++++++++ 00/64235/0664eccc-364e-11e2-af18-57a6d04fd4d5 B:16035388 md5:8ab769aa5224514a41cee0e3e2fe3aad
    

    Note that this is the MD5 sum calculated to verify transfer integrity, so it is available even for transfers without the --checksum flag. The checksum can also be logged when only one side of the transfer runs 3.1.0 or newer. For example, a newer rsync daemon on the target machine can do the checksum logging while an older rsync client sends, as long as MD5 is in use (rsync 3.0.0 or newer).
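
For the regular checks mentioned in the question, a log written in the format above can be turned into a manifest and compared against the files later. The following is only a minimal sketch, not part of rsync: the helper names (parse_log, md5_of, find_mismatches) are hypothetical, and it assumes the log uses exactly the format "%i %f B:%l md5:%C", that the logged paths are relative to the directory being checked, and that the logged checksum really is MD5 (newer rsync versions may negotiate a different algorithm, so this is worth confirming on your setup).

```python
import hashlib
import re
from pathlib import Path

# Matches lines written with the log format "%i %f B:%l md5:%C".
# The path group is greedy so that file names containing spaces still parse.
LOG_LINE = re.compile(
    r"^(?P<flags>\S+) (?P<path>.+) B:(?P<size>\d+) md5:(?P<md5>[0-9a-f]{32})$"
)


def parse_log(lines):
    """Return {path: (size, md5)} for every line that carries a full MD5 sum.

    Lines without a 32-digit hex checksum (for example directories) are skipped.
    """
    manifest = {}
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match:
            manifest[match["path"]] = (int(match["size"]), match["md5"])
    return manifest


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files are never read in one go."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(manifest, root):
    """Return paths under root that are missing or differ in size or MD5."""
    mismatches = []
    for rel_path, (size, md5) in manifest.items():
        target = Path(root) / rel_path
        # The cheap size check runs first; the file is hashed only if it passes.
        if (
            not target.is_file()
            or target.stat().st_size != size
            or md5_of(target) != md5
        ):
            mismatches.append(rel_path)
    return mismatches
```

A later check could then be as simple as calling find_mismatches(parse_log(open("transfer.log")), "/data"), where both the log file name and the directory are placeholders. With millions of files, re-reading everything is I/O bound, so the size comparison is done before hashing to skip the expensive step whenever possible.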