git download gzip tar gitweb

How to create stable checksums for auto-generated TarGZ archives?


For a build script, I need to work with source packages of a certain version. To avoid having to include big source archives, the script just stores their checksums (SHA1) and downloads the archives automatically. This works very well for official releases such as

http://download.videolan.org/pub/videolan/libdca/0.0.5/libdca-0.0.5.tar.bz2

However, some packages don't provide an official release, so I download a well-tested version from the version control system. For instance, Gitweb provides the handy "snapshot" feature for downloading a TarGZ archive:

http://git.videolan.org/?p=libbluray.git;a=snapshot;h=cf9ee593f;sf=tgz

Unfortunately, this URL returns a slightly different file on each request. The tar archive inside is always exactly the same, and it is always compressed via gzip in the same way, but the timestamp near the beginning of the gzip file (the MTIME field of the gzip header) differs from download to download.

Those few bytes make the checksum differ on each download, so the script can't ensure the integrity of the downloaded source archive anymore.
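To illustrate the effect, here is a minimal Python sketch. It assumes Python 3.8 or later (for the `mtime` argument of `gzip.compress`); the payload, the two timestamps and the helper name `header_mtime` are made up for the demonstration and have nothing to do with the real libbluray snapshot.

```python
import gzip
import hashlib
import struct

# Stand-in for the (identical) tar stream; purely illustrative data.
payload = b"identical tar contents\n" * 100

# Compress the same payload twice, changing only the header timestamp.
first = gzip.compress(payload, mtime=1000000000)
second = gzip.compress(payload, mtime=1000000060)


def header_mtime(blob: bytes) -> int:
    """Return the MTIME field (bytes 4-7, little-endian) of a gzip header."""
    return struct.unpack("<I", blob[4:8])[0]


# Offsets at which the two compressed files differ.
differing = [i for i, (a, b) in enumerate(zip(first, second)) if a != b]

print("mtime fields:", header_mtime(first), header_mtime(second))
print("differing byte offsets:", differing)
print("sha1 of .gz files equal:",
      hashlib.sha1(first).hexdigest() == hashlib.sha1(second).hexdigest())
print("decompressed data equal:",
      gzip.decompress(first) == gzip.decompress(second))
```

The differing offsets all fall inside the four-byte timestamp field, which is enough to change the SHA1 of the compressed file even though the decompressed data is identical.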

How can I circumvent this issue?


Solution

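  • If the tar stream itself is stable, checksum the decompressed data instead of the compressed file: zcat $archive | sha1sum. Otherwise, you could check out the exact commit by its SHA-1 using git (maybe as a shallow clone, e.g. with --depth 1 if the commit is a branch or tag tip), or store pristine-tar deltas that let you rebuild a stable archive.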
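If the build script is not a shell script, the first suggestion (hashing the decompressed stream rather than the .tar.gz file) could look roughly like the following Python sketch. The function name `sha1_of_decompressed` and the chunk size are illustrative choices, not part of any existing tool, and the approach only helps under the assumption stated above: that the tar stream itself is byte-for-byte stable.

```python
import gzip
import hashlib


def sha1_of_decompressed(path: str, chunk_size: int = 1 << 16) -> str:
    """Return the SHA1 hex digest of the decompressed contents of a .gz file.

    The gzip header (including its timestamp) is not part of the hashed
    data, so two downloads wrapping the same tar stream give the same digest.
    """
    digest = hashlib.sha1()
    with gzip.open(path, "rb") as stream:
        # Read in chunks so large archives are never held in memory at once.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The script would then store and compare this digest instead of the SHA1 of the downloaded file, for example `sha1_of_decompressed("libbluray-snapshot.tar.gz")` (the file name here is a placeholder). The result should match what `zcat $archive | sha1sum` prints for the same file.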