This is on a Mac, if it matters. zip is version 3.0 and unzip is version 6.0 (I expect these are the versions shipped with the OS).
I do the following: start with a generic 'pptx' file, unzip it into a directory, clean up the XML, then zip it up again:
unzip V1.pptx -d dir
cd dir
find . -name "*.xml" -type f -exec xmllint --output '{}' --format '{}' \;
zip -0 ../V1Orig.pptx -r *
I now have a new zip file, V1Orig.pptx. I then repeat the same steps on that file:
unzip V1Orig.pptx -d copy
cd copy
find . -name "*.xml" -type f -exec xmllint --output '{}' --format '{}' \;
zip -0 ../V1Copy.pptx -r *
If I now 'diff' the orig and copy directories, they are the same:
Common subdirectories: orig/_rels and copy/_rels
Common subdirectories: orig/docProps and copy/docProps
Common subdirectories: orig/ppt and copy/ppt
But if I diff the pptx files, or compute an md5 checksum of each, the results differ:
diff V1Orig.pptx V1Copy.pptx
Binary files V1Orig.pptx and V1Copy.pptx differ
ls -rtla orig
total 8
drwxr-xr-x 11 fultonm wheel 352 10 Jan 16:49 ppt
drwxr-xr-x 5 fultonm wheel 160 10 Jan 16:49 docProps
drwxr-xr-x 3 fultonm wheel 96 10 Jan 16:49 _rels
drwxr-xr-x 6 fultonm wheel 192 14 Jan 10:40 .
-rw-r--r-- 1 fultonm wheel 3212 14 Jan 10:42 [Content_Types].xml
drwxr-xr-x 8 fultonm wheel 256 14 Jan 10:57 ..
ls -rtla copy
total 8
drwxr-xr-x 5 fultonm wheel 160 14 Jan 10:42 docProps
drwxr-xr-x 3 fultonm wheel 96 14 Jan 10:42 _rels
drwxr-xr-x 6 fultonm wheel 192 14 Jan 10:42 .
drwxr-xr-x 11 fultonm wheel 352 14 Jan 10:42 ppt
-rw-r--r-- 1 fultonm wheel 3212 14 Jan 10:42 [Content_Types].xml
drwxr-xr-x 8 fultonm wheel 256 14 Jan 10:57 ..
You can get them to be the same by giving all of the files and directories the same timestamps, and by using the -X option so that zip does not save extra file attribute information. So for each zip command, use -rX, and in the copy directory do:
find . -exec touch -r ../dir/{} {} \;
before running zip.
Why it should matter that the zip files be identical, I have no idea. What matters is that they both decompress to the same thing.