Perl Archive::Zip Issue. Zip files become un-openable when using addString with 'large' files


I have a Perl system that takes emails, splits them into JSON files and attachments, then zips them up. The process works well, but I have noticed that the output zip file (created using Archive::Zip) is invalid when an input attachment is over roughly 10MB.

My code is:

my $zip = Archive::Zip->new();
$count = 0;
foreach $attachment (@atachments) {
   my $member = $zip->addString(@attachments[$count], @attachment_names[$count])
   $member->desiredCompressionMethod( COMPRESSION_STORED );
   $count++;
}
my $jsonMember = $zip->addString($json, $fn . '.json')
$jsonMember->desiredCompressionMethod( COMPRESSION_STORED );

unless ( $zip->writeToFileNamed($out) == AZ_OK ) {
    die 'zip error';
}

This works fine for anything under 10MB, but whenever an attachment is more than 10MB it stops working as expected: Windows shows the output zip file as smaller than I think it should be, and the zip file cannot be opened or extracted.

I have tried upgrading Archive::Zip to version 1.68, with no luck. I have also tried different compression levels, again with no luck.

The string itself is fine and I can verify it's all there, but I was wondering whether addString has a string length limit somewhere.

Could it be varying the zip type based on the file size? I see some material online about zip64, but that only seems to apply above 4GB.

Any help would be greatly appreciated.


Solution

  • Your code does not compile - there are missing semicolons at the end of some of the lines.

    Here is a modified version that runs stand-alone.

    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    use Archive::Zip qw( :ERROR_CODES :CONSTANTS );
    
    my $out = "test.zip";
    my @attachments = ('alpha'  x (1024 * 1024 * 10), # create a large file
                       'beta'   x (1024 * 1024 * 10), # create a large file
                       'gaamma' x (1024 * 1024 * 10), # create a large file
                      );
    
    my @attachment_names = ('name1', 'name2', 'name3');
    
    my $zip = Archive::Zip->new();
    
    for my $count (0 .. @attachments -1) {
       my $member = $zip->addString($attachments[$count], $attachment_names[$count]) ;
       $member->desiredCompressionMethod( COMPRESSION_STORED );
    }
    
    my $json = 'json';
    my $fn = 'something';
    my $jsonMember = $zip->addString($json, $fn . '.json') ;
    $jsonMember->desiredCompressionMethod( COMPRESSION_STORED );
    
    unless ( $zip->writeToFileNamed($out) == AZ_OK ) {
        die 'zip error';
    }
    
    

    Running that produced a file called test.zip.

    $ unzip -l test.zip 
    Archive:  test.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
     52428800  2023-10-10 15:54   name1
     41943040  2023-10-10 15:54   name2
     62914560  2023-10-10 15:54   name3
            4  2023-10-10 15:54   something.json
    ---------                     -------
    157286404                     4 files
    
    

    The zip file is valid

    $ unzip -t test.zip 
    Archive:  test.zip
        testing: name1                    OK
        testing: name2                    OK
        testing: name3                    OK
        testing: something.json           OK
    No errors detected in compressed data of test.zip.
    
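    The same check can be done from Perl by reading the archive back and comparing each member with the string that was added. The following is only a sketch: verify_zip and the file name verify-test.zip are made up for illustration, and the demo uses short strings rather than real attachments. It relies on the read, memberNamed, contents and uncompressedSize methods of Archive::Zip. Note that length counts characters, so this comparison assumes the strings hold raw bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Archive::Zip qw( :ERROR_CODES :CONSTANTS );

# Hypothetical helper: re-read a zip file and compare each member against
# the string that was originally added. Returns a list of problem messages
# (empty list means everything matched).
sub verify_zip {
    my ($file, %expected) = @_;    # %expected maps member name => original string
    my @problems;

    my $zip = Archive::Zip->new();
    return ("cannot read $file") unless $zip->read($file) == AZ_OK;

    for my $name (sort keys %expected) {
        my $member = $zip->memberNamed($name);
        if (!$member) {
            push @problems, "$name: missing from archive";
            next;
        }
        my $want = length $expected{$name};
        my $got  = $member->uncompressedSize();
        if ($got != $want) {
            push @problems, "$name: expected $want bytes, archive has $got";
        }
        elsif ($zip->contents($member) ne $expected{$name}) {
            push @problems, "$name: contents differ";
        }
    }
    return @problems;
}

# Small demonstration with short strings; swap in the real attachment data.
my %data = ( name1 => 'alpha' x 1000, 'something.json' => 'json' );
my $out  = 'verify-test.zip';

my $zip = Archive::Zip->new();
for my $name (sort keys %data) {
    my $member = $zip->addString($data{$name}, $name);
    $member->desiredCompressionMethod( COMPRESSION_STORED );
}
$zip->writeToFileNamed($out) == AZ_OK or die 'zip error';

my @problems = verify_zip($out, %data);
print @problems ? join("\n", @problems) . "\n" : "all members match\n";
```

    If a size mismatch shows up here for the large attachments only, that would point at the data being passed to addString rather than at the zip format itself.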

    A 10 MB file is a perfectly normal size for Archive::Zip. You only need zip64 if:

    • an individual file is larger than 4 GB
    • the individual entries are each under 4 GB, but the combined size of the zip file is over 4 GB
    • there are more than 64k entries in the zip file

    If none of those apply, the problem is elsewhere and you need to supply a reproducible example.
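    To rule out zip64 quickly, the three conditions above can be checked against the planned member sizes. This is a rough sketch, not part of Archive::Zip: zip64_reasons and the two constants are hypothetical names, the thresholds are the approximate classic zip field limits, and the summed size is only an estimate of the archive size (close for stored members, an overestimate for compressed ones).

```perl
use strict;
use warnings;

# Approximate classic (non-zip64) limits: 32-bit size and offset fields,
# and a 16-bit entry count.
use constant MAX_32 => 0xFFFFFFFF;    # just under 4 GiB
use constant MAX_16 => 0xFFFF;        # 65535 entries

# Hypothetical helper: given the sizes in bytes of the planned members,
# return the reasons (if any) why zip64 would be required.
sub zip64_reasons {
    my @sizes = @_;
    my @reasons;

    my $total = 0;
    $total += $_ for @sizes;

    push @reasons, 'a member is larger than 4 GiB' if grep { $_ > MAX_32 } @sizes;
    push @reasons, 'combined size exceeds 4 GiB'   if $total > MAX_32;
    push @reasons, 'more than 65535 entries'       if @sizes > MAX_16;

    return @reasons;
}

# Sizes taken from the unzip -l listing above.
my @sizes   = (52_428_800, 41_943_040, 62_914_560, 4);
my @reasons = zip64_reasons(@sizes);
print @reasons ? "zip64 needed: @reasons\n" : "zip64 not needed\n";
```

    For the test archive this prints "zip64 not needed", since the total is about 150 MB across four entries.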