Create checksum of large sparse image in Linux


I have several sparse images on my Linux server (320G apparent size in total, of which 111G is actually allocated) and would like to compute a checksum of each one every night. Is there an efficient way to create the checksum? With the straightforward approach below, the checksum tool reads through every hole, so even an image that is entirely empty takes a long time:

~ # dd bs=1 count=0 seek=5G if=/dev/zero of=sparse.img
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00036461 s, 0.0 kB/s
~ # du -hs sparse.img
0   sparse.img
~ # time sha512sum sparse.img
e4f21997407b9cb0df347f6eba2...  sparse.img
real    0m55.339s
user    0m52.010s
sys     0m2.790s

Solution

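  • A good solution has existed since 2016: starting with version 1.29, GNU tar uses SEEK_DATA/SEEK_HOLE, where possible, to detect sparse files. Sparse-file handling is enabled by passing --sparse, so for example: tar -c --sparse <file name> | md5sum gives you a repeatable way to checksum your file while reading it only once and skipping the holes. Note that the checksum covers the tar stream, which includes header metadata such as the file name and modification time, so it is not the value that md5sum <file name> would print; compare it only with checksums produced the same way.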
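
The tar pipeline from the answer can be tried out with a small script. This is only a sketch under a few assumptions that are not in the original post: GNU coreutils (truncate, mktemp, sha512sum) are available, the test image is a hypothetical 256M file named sparse.img in a temporary directory, and sha512sum from the question is used in place of md5sum.

```shell
#!/bin/sh
# Sketch: checksum a sparse file through "tar --sparse" instead of reading every hole.
# Assumptions (not from the original post): GNU tar and GNU coreutils are installed;
# the file name and size below are illustrative only.
set -e

dir=$(mktemp -d)                      # scratch directory for the test image
truncate -s 256M "$dir/sparse.img"    # sparse file: 256M apparent size, no data blocks

# Hash the tar stream; --sparse makes tar store holes as a map instead of zeros.
# -C keeps the directory name out of the archive, so only "sparse.img" is recorded.
sparse_sum() {
    tar -c --sparse -C "$1" "$2" | sha512sum | cut -d' ' -f1
}

sum1=$(sparse_sum "$dir" sparse.img)
sum2=$(sparse_sum "$dir" sparse.img)  # second run: same file, same metadata

echo "first run:  $sum1"
echo "second run: $sum2"

rm -rf "$dir"
```

Both runs should print the same value as long as the file's contents, name, and metadata are unchanged between them. For a nightly job, the same function could be called once per image and the result stored next to the previous night's value for comparison.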