How to recreate PHAR files with identical sha1sums at different times?


I'm working on a command-line PHP project and want to be able to recreate the PHAR file that is my deployment artifact. The challenge is that I can't create two PHARs with identical sha1sums if they are created more than 1 second apart. I would like to be able to exactly recreate my PHAR file if the input files are the same (i.e. came from the same git commit).

The following code snippet demonstrates the problem:

    #!/usr/bin/php
    <?php
    $hashes = array();
    $file_names = array('file1.phar', 'file2.phar');

    foreach ($file_names as $name) {
      if (file_exists($name)) {
        unlink($name);
      }
      $phar = new Phar($name);
      $phar->addFromString('cli.php', "cli\n");
      $hashes[] = sha1_file($name);
      // Remove the sleep and the PHARs are identical.
      sleep(1);
    }
    if ($hashes[0] == $hashes[1]) {
      echo "match\n";
    } else {
      echo "do not match\n";
    }

As far as I can tell, the "modification time" field for each file in the PHAR manifest is always set to the current time, and there seems to be no way of overriding that. Even touch("phar://file1.phar/cli.php", 1413387555) gives the error:

touch(): Can not call touch() for a non-standard stream

I ran the above code with PHP 5.5.9 on Ubuntu Trusty and PHP 5.3 on RHEL5; both versions behave the same way and fail to create identical PHAR files.
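One way to confirm that the per-file timestamp is what differs is to read it straight out of the manifest. The following is a minimal sketch, assuming the manifest layout described in the PHP manual's Phar file format pages and an archive written by the Phar class without whole-archive compression. `phar_manifest_timestamps` is a hypothetical helper name, not part of the Phar extension, and the handling of the stub's closing tag and line break is an assumption about how the extension writes its stub.

```php
<?php
// Hypothetical helper (not part of the Phar extension): returns
// [filename => timestamp] by walking the manifest of a PHAR file's raw bytes.
function phar_manifest_timestamps(string $archive): array
{
    // Assumption: the stub ends with __HALT_COMPILER(); optionally followed by
    // a closing PHP tag and one line break, and the manifest starts right after.
    $pattern = '/__HALT_COMPILER\(\);(?:[ \n]\?>(?:\r?\n)?)?/';
    if (!preg_match($pattern, $archive, $match, PREG_OFFSET_CAPTURE)) {
        throw new InvalidArgumentException("end of stub not found");
    }
    $pos = $match[0][1] + strlen($match[0][0]);

    // Manifest header: length, file count, API version, global flags, alias length.
    $head = unpack(
        "Vmanifest_length/Vnumber_of_files/vapi_version/Vglobal_flags/Valias_length",
        substr($archive, $pos, 18)
    );
    $pos += 18 + $head["alias_length"];

    // Global metadata: 4-byte length followed by the payload.
    $meta = unpack("Vlength", substr($archive, $pos, 4));
    $pos += 4 + $meta["length"];

    $timestamps = array();
    for ($i = 0; $i < $head["number_of_files"]; $i++) {
        // Each entry starts with the filename length and the filename itself.
        $len = unpack("Vlen", substr($archive, $pos, 4))["len"];
        $name = substr($archive, $pos + 4, $len);
        $pos += 4 + $len;
        // Fixed-size part of the entry; the timestamp is its second field.
        $entry = unpack(
            "Vuncompressed_size/Vtimestamp/Vcompressed_size/Vcrc32/Vflags/Vmetadata_length",
            substr($archive, $pos, 24)
        );
        $timestamps[$name] = $entry["timestamp"];
        $pos += 24 + $entry["metadata_length"];
    }
    return $timestamps;
}
```

Calling it as `phar_manifest_timestamps(file_get_contents('file1.phar'))` and again for `file2.phar` should show whether the two archives carry different timestamps for `cli.php`.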

I'm trying to do this in order to follow the advice in the book Continuous Delivery by Jez Humble and David Farley.

Any help is appreciated.


Solution

  • The Phar class currently does not allow users to alter or even access the modification time. I thought of storing your string in a temporary file and using touch to alter the mtime, but that does not seem to have any effect. So you'll have to manually change the timestamps in the created files and then regenerate the archive signature. Here's how to do it with current PHP versions:

    <?php
        $filename = "file1.phar";
    
        $archive = file_get_contents($filename);
    
        # Search for the start of the archive header
        # See http://php.net/manual/de/phar.fileformat.phar.php
        # This isn't the only valid way to write a PHAR archive, but it is what the Phar class
        # currently does, so you should be fine (The docs say that the end-of-PHP-tag is optional)
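        $magic = "__HALT_COMPILER(); ?" . ">";
        $end_of_code = strpos($archive, $magic) + strlen($magic);
        # The stub may be followed by a line break before the manifest starts; skip it if present
        if(substr($archive, $end_of_code, 2) == "\r\n") $end_of_code += 2;
        elseif(substr($archive, $end_of_code, 1) == "\n") $end_of_code += 1;
        $data_pos = $end_of_code;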
    
        # Skip that header
        $data = unpack("Vmanifest_length/Vnumber_of_files/vapi_version/Vglobal_flags/Valias_length", substr($archive, $end_of_code, 18));
        $data_pos += 18 + $data["alias_length"];
        $metadata = unpack("Vlength", substr($archive, $data_pos, 4));
        $data_pos += 4 + $metadata["length"];
    
        for($i=0; $i<$data["number_of_files"]; $i++) {
            # $data_pos now points to the manifest entry of the current file
            # Files are explained here: http://php.net/manual/de/phar.fileformat.manifestfile.php
            $filename_data = unpack("Vfilename_length", substr($archive, $data_pos, 4));
            $data_pos += 4 + $filename_data["filename_length"];
            $file_data = unpack("Vuncompressed_size/Vtimestamp/Vcompressed_size/VCRC32/Vflags/Vmetadata_length", substr($archive, $data_pos, 24));
            # Change the timestamp to zeros (any other fixed value works too, e.g. pack("V", 1413387555); using time() would bring back the original problem)
            $archive = substr($archive, 0, $data_pos + 4) . "\0\0\0\0" . substr($archive, $data_pos + 8);
            # Skip to the next file (it's _all_ the headers first, then file data)
            $data_pos += 24 + $file_data["metadata_length"];
        }
    
        # Regenerate the file's signature
        $sig_data = unpack("Vsigflags/C4magic", substr($archive, strlen($archive) - 8));
        if($sig_data["magic1"] == ord("G") && $sig_data["magic2"] == ord("B") && $sig_data["magic3"] == ord("M") && $sig_data["magic4"] == ord("B")) {
            if($sig_data["sigflags"] == 1) {
                # MD5
                $sig_pos = strlen($archive) - 8 - 16;
                $archive = substr($archive, 0, $sig_pos) . pack("H32", md5(substr($archive, 0, $sig_pos))) . substr($archive, $sig_pos + 16);
            }
            else {
                # SHA1
                $sig_pos = strlen($archive) - 8 - 20;
                $archive = substr($archive, 0, $sig_pos) . pack("H40", sha1(substr($archive, 0, $sig_pos))) . substr($archive, $sig_pos + 20);
            }
            # Note: The manual mentions SHA256/SHA512 support, but the corresponding flags aren't documented yet. Currently,
            # PHAR uses SHA1 by default, so there's nothing to worry about. You might still have to add those at some point.
        }
    
        file_put_contents($filename, $archive);
    

    I've written this ad hoc for my local PHP 5.5.9 version and your example above. The script will work for files created in a similar way to your example code. The documentation hints at some valid deviations from this format. There are comments at the corresponding lines in the code; you might have to add something there if you want to support general Phar files.
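The signature step above only distinguishes MD5 from SHA1. As a minimal sketch of how it could be generalised, the helper below maps the signature flag to a hash algorithm. It assumes the flag values equal the Phar::MD5, Phar::SHA1, Phar::SHA256 and Phar::SHA512 class constants (1 to 4). `phar_resign` is a hypothetical name, not part of the Phar extension, and OpenSSL-signed archives are not handled.

```php
<?php
// Hypothetical helper (not part of the Phar extension): recomputes the trailing
// hash signature of a PHAR after its bytes were patched.
function phar_resign(string $archive): string
{
    // Assumed flag values, matching Phar::MD5, Phar::SHA1, Phar::SHA256, Phar::SHA512.
    $algorithms = array(1 => 'md5', 2 => 'sha1', 3 => 'sha256', 4 => 'sha512');

    // A signed archive ends with: <raw hash> <4-byte flags> "GBMB".
    if (strlen($archive) < 8 || substr($archive, -4) !== 'GBMB') {
        return $archive; // no signature trailer: nothing to regenerate
    }
    $flags = unpack('Vflags', substr($archive, -8, 4))['flags'];
    if (!isset($algorithms[$flags])) {
        throw new RuntimeException('Unsupported signature type: ' . $flags);
    }
    $algo = $algorithms[$flags];

    // Raw digest length of the chosen algorithm (16, 20, 32 or 64 bytes).
    $hash_length = strlen(hash($algo, '', true));
    $sig_pos = strlen($archive) - 8 - $hash_length;

    // Hash everything before the signature and splice the raw digest back in.
    return substr($archive, 0, $sig_pos)
        . hash($algo, substr($archive, 0, $sig_pos), true)
        . substr($archive, $sig_pos + $hash_length);
}
```

With this in place, the last step of the script could become `file_put_contents($filename, phar_resign($archive));` once the timestamps have been overwritten.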