Tags: java, axios, download, zip

Zip from Memory and Hard Drive are different which results in a corrupted Zip after download


Problem is fixed. See below.

Hey everyone, I want to implement a zip download. I have a web application where the user can click a button:

<template>
  <button name="exportButton" id="exportButton" @click="convertConfigurationToZip" :class="['button']">{{
      exportConfig
    }}
  </button>
</template>

methods: {
    async convertConfigurationToZip() {
      // Stores the current config in "store.config"
      RestResource.convertConfigurationToZip();

      const configData = store.config;
      await new Promise((resolve) => setTimeout(resolve, 2000));

      // A Blob represents raw binary data
      const blob = new Blob([configData], {type: 'application/zip'});

      // Use file-saver to trigger the download
      saveAs(blob, 'config')
    }
}

The Function called in the RestResource is the following:

convertConfigurationToZip: () => {
        axios
            .get("http://localhost:8080/api/conversion/convertConfigurationToZip/")
            .then((response: any) => {
                console.log(response.data);
                store.config = response.data;
            })
            .catch((error: any) => {
                console.log(error);
            })
    }

The Controller that is being called looks like this:

package controller;

import io.swagger.annotations.*;
import jakarta.annotation.Resource;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.container.AsyncResponse;
import jakarta.ws.rs.container.Suspended;
import jakarta.ws.rs.core.Response;
import services.ConversionService;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// This Controller should be used for creating zip files that contains the ProCake-Configuration
@Api(tags = "Conversion Controller", authorizations = { @Authorization(value = "basicAuth") })
@Path("/conversion")
public class ConversionController
{
    @Resource
    ExecutorService executorService;

    @ApiOperation(value = "Convert Config into Zip")
    @ApiResponses({ @ApiResponse(code = 200, message = "Config as Zip"), @ApiResponse(code = 503, message = "ProCAKE not started/configured properly") })
    @GET
    @Path("/convertConfigurationToZip")
    public void convertConfigurationToZip(@Suspended final AsyncResponse response)
    {
        executorService = Executors.newSingleThreadExecutor();
        executorService.submit(() ->
        {
            try
            {
                response.resume(Response.status(200).entity(new ConversionService().convertConfigurationToZip()).build());
            }
            catch (Exception e)
            {
                response.resume(e);
            }
            executorService.shutdown();
        });
    }
}

The Service where the Zip File is being handled looks like this:

package services;

import de.uni_trier.wi2.procake.data.model.Model;
import de.uni_trier.wi2.procake.data.model.ModelFactory;
import de.uni_trier.wi2.procake.similarity.SimilarityModel;
import de.uni_trier.wi2.procake.similarity.SimilarityModelFactory;
import de.uni_trier.wi2.procake.utils.io.IOUtil;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream;
import org.apache.commons.io.IOUtils;
import org.apache.jena.sparql.exec.RowSet;

import java.io.*;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipInputStream;

public class ConversionService
{
    public byte[] convertConfigurationToZip() throws IOException
    {
        try
        {
            // Create Zip on Harddrive
            List<File> listOfFiles = new ArrayList<File>();
            File hello = new File("pathToFile");
            File world = new File("pathToFile");
            listOfFiles.add(hello);
            listOfFiles.add(world);
            IOUtil.createZipFile("pathToZip", listOfFiles);
            // ---------------------

            // Create Zip File in Memory
            ByteArrayOutputStream baos = IOUtil.createZipFileInMemory(listOfFiles);
            // ---------------------

            // Compare both Zip Files
            byte[] zipContentHardDrive = Files.readAllBytes(Paths.get("pathToZip"));
            byte[] zipContentMemory = baos.toByteArray();

            int minLength = Math.min(zipContentMemory.length, zipContentHardDrive.length);

            if (!Arrays.equals(
                            Arrays.copyOf(zipContentMemory, minLength),
                            Arrays.copyOf(zipContentHardDrive, minLength))) {
                throw new IOException("The Zip Files are different.");
            }
            // ---------------------

            return zipContentMemory;
        }catch (Exception e) {
            System.out.println("Could not create Zip File");
        }
        return null;
    }
}

Finally, before I go into more detail about my problem, here are the IOUtil methods:

public static ByteArrayOutputStream createZipFileInMemory(List<File> files) throws IOException {
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    ZipArchiveOutputStream zipArchiveOutputStream = new ZipArchiveOutputStream(
        byteArrayOutputStream);

    for (File file : files) {
      addZipEntry(zipArchiveOutputStream, file);
    }

    zipArchiveOutputStream.finish();
    return byteArrayOutputStream;
  }

  private static void addZipEntry(ArchiveOutputStream archiveOutputStream, File file)
      throws IOException {
    String entryName = file.getName();
    ArchiveEntry archiveEntry = new ZipArchiveEntry(file, entryName);
    archiveOutputStream.putArchiveEntry(archiveEntry);
    if (file.isFile()) {
      Files.copy(file.toPath(), archiveOutputStream);
    }
    archiveOutputStream.closeArchiveEntry();
  }

createZipFile works completely fine. My problem is that the zip file I get from the download is about 200 bytes bigger than the one I create directly on my hard drive, and I can't open the bigger file. Opening both files in vim shows completely different content:

smaller zip:

" zip.vim version v33
" Browsing zipfile pathToZip
" Select a file with cursor and press ENTER

hello.txt
world.txt

and here is the bigger one: PK^C^D^T^@^H^H^H^@���W^@^@^@^@^@^@^@^@^@^@^@^@ ^@5^@hello.txtUT^M^@^G���eִ�e���e

I don't know whether my method for creating the zip file in memory is wrong or whether something goes wrong with the transfer to store.config. One more noteworthy thing is that zipContentHardDrive and zipContentMemory are different:

zipContentHardDrive: [80, 75, 3, 4, 20, 0, 0, 8, 8, 0, -72, -114, -103, 87, 32, 48, 58, 54, 8, 0, 0, 0, 6, 0, 0, 0, 9, 0, 53, 0, 104, 101, 108, 108, 111, 46, 116, 120, 116, 85, 84, 13, 0, 7, -99, -77, -119, 101, -42, -76, -119, 101, -99, -77, -119, 101, 10, 0, 32, 0, 0, 0, 0, 0, 1, 0, 24, 0, -106, 85, -35, -18, 82, 55, -38, 1, 4, 18, -88, -87, 83, 55, -38, 1, -128, 76, -114, -18, 82, 55, -38, 1, -53, 72, -51, -55, -55, -25, 2, 0, +322 more]

zipContentMemory: [80, 75, 3, 4, 20, 0, 8, 8, 8, 0, -72, -114, -103, 87, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 53, 0, 104, 101, 108, 108, 111, 46, 116, 120, 116, 85, 84, 13, 0, 7, -99, -77, -119, 101, -42, -76, -119, 101, -99, -77, -119, 101, 10, 0, 32, 0, 0, 0, 0, 0, 1, 0, 24, 0, -106, 85, -35, -18, 82, 55, -38, 1, 4, 18, -88, -87, 83, 55, -38, 1, -128, 76, -114, -18, 82, 55, -38, 1, -53, 72, -51, -55, -55, -25, 2, 0, +354 more]

I'm not sure whether this is because zip files normally differ depending on whether they are created in memory or on disk.

I would really appreciate it if someone could help me understand and fix my problem. Thank you very much for taking the time!
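On the question of whether an in-memory zip is expected to differ from one written to disk: the sketch below builds a small archive in memory and reads it back. It deliberately uses the JDK's java.util.zip classes instead of the Commons Compress classes from the question, and the class name, method names, and file contents are made up for illustration. It suggests that an archive written to a non-seekable stream can carry the "streamed" flag in its local headers and still be a perfectly readable zip, so a difference in those header bytes alone need not mean corruption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the question's code base.
class InMemoryZipSketch {

    // Builds a zip archive entirely in memory from name -> content pairs.
    static byte[] zip(Map<String, String> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the ZipOutputStream writes the central directory.
        try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(file.getKey()));
                zos.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the archive back and returns name -> content pairs.
    static Map<String, String> unzip(byte[] archive) {
        Map<String, String> files = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                files.put(entry.getName(), new String(zis.readAllBytes(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    // Bit 3 of the general purpose flag (offset 6 of the first local header)
    // marks an entry whose CRC and sizes follow the data.
    static boolean firstEntryIsStreamed(byte[] archive) {
        return (archive[6] & 0x08) != 0;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("hello.txt", "hello\n");
        files.put("world.txt", "world\n");

        byte[] archive = zip(files);
        System.out.println("streamed: " + firstEntryIsStreamed(archive));
        System.out.println("round trip ok: " + unzip(archive).equals(files));
    }
}
```

If the round trip succeeds on the server but the downloaded file is still broken, the damage is more likely happening during transfer than during zip creation.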

Update 1

After running the following test, I found out that, for reasons unknown, the createZipFileInMemory method does not work as it should in the ConversionService.

@Test
  public void testCreateZipFileInMemoryWithFiles() throws IOException {
    IOUtil.writeFile(graph, PATH_GRAPH_TXT);
    IOUtil.writeFile(graph, PATH_GRAPH_XML);

    File fileText = new File(PATH_GRAPH_TXT);
    File fileXML = new File(PATH_GRAPH_XML);

    List<File> listOfFiles = new ArrayList<>();
    listOfFiles.add(fileText);
    listOfFiles.add(fileXML);

    ByteArrayOutputStream zipFileInMemory = IOUtil.createZipFileInMemory(listOfFiles);
    assertNotNull(zipFileInMemory);

    // Now let's check if the data stored in the memory is actually the right Zip-file

    // Therefore let's create an actual Zip-File on our hard drive
    IOUtil.createZipFile(PATH_CREATED_ZIP, listOfFiles);
    
    // Read the content of the created zip file
    byte[] actualZipContent = Files.readAllBytes(Paths.get(PATH_CREATED_ZIP));
    
    // Compare the contents of the files within the zip
    ZipInputStream dataOfMemoryZip = new ZipInputStream(new ByteArrayInputStream(zipFileInMemory.toByteArray()));
    ZipInputStream dataOfHardDriveZip = new ZipInputStream(new ByteArrayInputStream(actualZipContent));

    ZipEntry memoryEntry;
    ZipEntry hardDriveEntry;

    while ((memoryEntry = dataOfMemoryZip.getNextEntry()) != null) {
      hardDriveEntry = dataOfHardDriveZip.getNextEntry();

      assertEquals(memoryEntry.getName(), hardDriveEntry.getName());

      byte[] expectedFileContent = IOUtils.toByteArray(dataOfMemoryZip);
      byte[] actualFileContent = IOUtils.toByteArray(dataOfHardDriveZip);

      assertArrayEquals(expectedFileContent, actualFileContent);
    }
  }

Here the contents are always the same. I do not understand why the contents are different in the ConversionService class, since I use the same files and the same methods.

Update 2

As requested, here is the hexdump of the zip that I get from the download:

504b0304140000080800efbfbdefbfbdefbfbd5720303a36080000000600 00000900350068656c6c6f2e74787455540d0007efbfbdefbfbdefbfbd65 d6b4efbfbd65efbfbdefbfbdefbfbd650a0020000000000001001800efbf bd55efbfbdefbfbd5237efbfbd010412efbfbdefbfbd5337efbfbd01efbf bd4cefbfbdefbfbd5237efbfbd01efbfbd48efbfbdefbfbdefbfbdefbfbd 0200504b0304140000080800efbfbdefbfbdefbfbd57efbfbd6138efbfbd 080000000600000009003500776f726c642e74787455540d0007efbfbdef bfbdefbfbd65efbfbdefbfbdefbfbd65efbfbdefbfbdefbfbd650a002000 000000000100180009efbfbd28efbfbd5237efbfbd01efbfbdefbfbdefbf bdefbfbd5237efbfbd01efbfbd5aefbfbdefbfbd5237efbfbd012befbfbd 2fefbfbd49efbfbd0200504b01021400140000080800efbfbdefbfbdefbf bd5720303a36080000000600000009002d00000000000000000000000000 000068656c6c6f2e7478745554050007efbfbdefbfbdefbfbd650a002000 0000000001001800efbfbd55efbfbdefbfbd5237efbfbd010412efbfbdef bfbd5337efbfbd01efbfbd4cefbfbdefbfbd5237efbfbd01504b01021400 140000080800efbfbdefbfbdefbfbd57efbfbd6138efbfbd080000000600 000009002d000000000000000000000064000000776f726c642e74787455 54050007efbfbdefbfbdefbfbd650a002000000000000100180009efbfbd 28efbfbd5237efbfbd01efbfbdefbfbdefbfbdefbfbd5237efbfbd01efbf bd5aefbfbdefbfbd5237efbfbd01504b05060000000002000200efbfbd00 0000efbfbd0000000000

and here is the hexdump of the working zip file:

504b0304140000080800b88e995720303a36080000000600000009003500 68656c6c6f2e74787455540d00079db38965d6b489659db389650a002000 00000000010018009655ddee5237da010412a8a95337da01804c8eee5237 da01cb48cdc9c9e70200504b0304140000080800c08e9957a86138dd0800 00000600000009003500776f726c642e74787455540d0007a9b38965abb3 8965a9b389650a002000000000000100180009ba28f65237da0183ddeaf6 5237da01805ab5f55237da012bcf2fca49e10200504b0102140014000008 0800b88e995720303a36080000000600000009002d000000000000000000 00000000000068656c6c6f2e74787455540500079db389650a0020000000 0000010018009655ddee5237da010412a8a95337da01804c8eee5237da01 504b01021400140000080800c08e9957a86138dd08000000060000000900 2d000000000000000000000064000000776f726c642e7478745554050007 a9b389650a002000000000000100180009ba28f65237da0183ddeaf65237 da01805ab5f55237da01504b05060000000002000200c8000000c8000000 0000

Problem Fix: Here are the changes:

  methods: {
    async convertConfigurationToZip() {
      // Ignore clicks while a download is already running
      if (this.conversionInProgress) {
        return;
      }

      this.conversionInProgress = true;
      try {
        // Fetch the zip as binary data (response.data is a Blob)
        const response = await RestResource.convertConfigurationToZip();

        // A Blob represents raw binary data
        const blob = new Blob([response.data], {type: 'application/zip'});

        // Use file-saver to trigger the download
        saveAs(blob, 'config.zip');
      } finally {
        // Reset the flag even if the request fails
        this.conversionInProgress = false;
      }
    }
  }

and

convertConfigurationToZip: async () => {
        return axios.get("http://localhost:8080/api/conversion/convertConfigurationToZip/", {
            responseType: "blob", // Ensure the response is treated as a blob
        })
            .then((response) => {
                return response; // Return the entire Axios response object
            })
            .catch((error) => {
                console.error(error);
                throw error; // Rethrow the error to handle it appropriately in the calling context
            });
    }

I assume that const configData = store.config was a string instead of binary data: without responseType: "blob", axios decodes the response body as text.


Solution

  • The partial decimal dumps of the zip files you created are indeed different.

    zipContentMemory is a streamed zip file, whilst zipContentHardDrive is not. That could be enough to explain why one zip file works, but the other doesn't.

    Note: without seeing the complete zip files it isn't possible to be 100% certain of that. There may be some other difference in the zip files that is causing the issue you are seeing. Post a dump of the complete zips if you get a chance.

    The partial zipContentHardDrive zip file looks like this (data below created with zipdetails):

    0000 LOCAL HEADER #1       04034B50 (67324752)
    0004 Extract Zip Spec      14 (20) '2.0'
    0005 Extract OS            00 (0) 'MS-DOS'
    0006 General Purpose Flag  0800 (2048)
         [Bits 1-2]            0 'Normal Compression'
         [Bit 11]              1 'Language Encoding'
    0008 Compression Method    0008 (8) 'Deflated'
    000A Last Mod Date/Time    57E6F1C7 (1474752967) 'Invalid Date or Time'
    000E CRC                   363A3020 (909783072)
    0012 Compressed Size       00000008 (8)
    0016 Uncompressed Size     00000006 (6)
    001A Filename Length       0009 (9)
    001C Extra Length          0035 (53)
    001E Filename              'hello.txt'
    0027 Extra ID #1           5455 (21589) 'Extended Timestamp [UT]'
    0029   Length              000D (13)
    002B   Flags               07 (7) 'mod access change'
    002C   Mod Time            65F6CCE2 (1710673122) 'Sun Mar 17 10:58:42 2024'
    0030   Access Time         65F6CBA9 (1710672809) 'Sun Mar 17 10:53:29 2024'
    0034   Change Time         65F6CCE2 (1710673122) 'Sun Mar 17 10:58:42 2024'
    0038 Extra ID #2           000A (10) 'NTFS FileTimes'
    003A   Length              0020 (32)
    003C   Reserved            00000000 (0)
    0040   Tag1                0001 (1)
    0042   Size1               0018 (24)
    0044   Mtime               01A5375291A255E9 (118561792965367273) 'Thu Sep 16 07:08:16 1976 536727300ns'
    004C   Atime               01A53753D6D71204 (118561798421418500) 'Thu Sep 16 07:17:22 1976 141850000ns'
    0054   Ctime               01A5375291F14CFF (118561792970542335) 'Thu Sep 16 07:08:17 1976 54233500ns'
    

    The partial zipContentMemory zip file looks like this

    0000 LOCAL HEADER #1       04034B50 (67324752)
    0004 Extract Zip Spec      14 (20) '2.0'
    0005 Extract OS            00 (0) 'MS-DOS'
    0006 General Purpose Flag  0808 (2056)
         [Bits 1-2]            0 'Normal Compression'
         [Bit  3]              1 'Streamed'
         [Bit 11]              1 'Language Encoding'
    0008 Compression Method    0008 (8) 'Deflated'
    000A Last Mod Date/Time    57E6F1C7 (1474752967) 'Invalid Date or Time'
    000E CRC                   00000000 (0)
    0012 Compressed Size       00000000 (0)
    0016 Uncompressed Size     00000000 (0)
    001A Filename Length       0009 (9)
    001C Extra Length          0035 (53)
    001E Filename              'hello.txt'
    0027 Extra ID #1           5455 (21589) 'Extended Timestamp [UT]'
    0029   Length              000D (13)
    002B   Flags               07 (7) 'mod access change'
    002C   Mod Time            65F6CCE2 (1710673122) 'Sun Mar 17 10:58:42 2024'
    0030   Access Time         65F6CBA9 (1710672809) 'Sun Mar 17 10:53:29 2024'
    0034   Change Time         65F6CCE2 (1710673122) 'Sun Mar 17 10:58:42 2024'
    0038 Extra ID #2           000A (10) 'NTFS FileTimes'
    003A   Length              0020 (32)
    003C   Reserved            00000000 (0)
    0040   Tag1                0001 (1)
    0042   Size1               0018 (24)
    0044   Mtime               01A5375291A255E9 (118561792965367273) 'Thu Sep 16 07:08:16 1976 536727300ns'
    004C   Atime               01A53753D6D71204 (118561798421418500) 'Thu Sep 16 07:17:22 1976 141850000ns'
    0054   Ctime               01A5375291F14CFF (118561792970542335) 'Thu Sep 16 07:08:17 1976 54233500ns'
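The flag difference between the two listings can also be read straight from the decimal dumps in the question. This is a minimal sketch, assuming only the first ten bytes of each dump; the class and method names are invented for illustration. It decodes the little-endian general purpose flag at offset 6 of the local header and tests bit 3 (streamed) and bit 11 (UTF-8 names).

```java
// Hypothetical helper class for inspecting a zip local file header.
class ZipFlagSketch {

    // Reads the little-endian general purpose flag at offset 6 of a local file header.
    static int generalPurposeFlag(byte[] localHeader) {
        return (localHeader[6] & 0xFF) | ((localHeader[7] & 0xFF) << 8);
    }

    // Bit 3: CRC and sizes are written after the data, in a data descriptor.
    static boolean isStreamed(int flag) {
        return (flag & (1 << 3)) != 0;
    }

    // Bit 11: file names are UTF-8 encoded.
    static boolean usesUtf8Names(int flag) {
        return (flag & (1 << 11)) != 0;
    }

    public static void main(String[] args) {
        // First ten bytes of each dump quoted in the question.
        byte[] hardDrive = {80, 75, 3, 4, 20, 0, 0, 8, 8, 0};
        byte[] memory = {80, 75, 3, 4, 20, 0, 8, 8, 8, 0};

        for (byte[] header : new byte[][] {hardDrive, memory}) {
            int flag = generalPurposeFlag(header);
            System.out.printf("flag=0x%04X streamed=%b utf8Names=%b%n",
                    flag, isStreamed(flag), usesUtf8Names(flag));
        }
    }
}
```

For a streamed entry the CRC and size fields in the local header are zero, which is why those fields differ between the two listings above.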
    

    UPDATE

    The original file, copied to /tmp/2.zip, is fine

    $ unzip -t /tmp/2.zip
    Archive:  /tmp/2.zip
        testing: hello.txt                OK
        testing: world.txt                OK
    No errors detected in compressed data of /tmp/2.zip.
    

    The downloaded file, copied to /tmp/1.zip, is a very corrupt zip file

    $ unzip -t /tmp/1.zip
    Archive:  /tmp/1.zip
    
    caution:  zipfile comment truncated
    error [/tmp/1.zip]:  missing 3232546215 bytes in zipfile
      (attempting to process anyway)
    error [/tmp/1.zip]:  attempt to seek before beginning of zipfile
      (please check that you have transferred or created the zipfile in the
      appropriate BINARY mode and that you have compiled UnZip properly)
    
    

    Looking at the hex dump, there are many repeated sequences of EF BF BD. Those three bytes are the UTF-8 encoding of the Unicode replacement character U+FFFD (�).

    That suggests the root cause is that either your server code or your client code is processing the zip file as a UTF-8 string, rather than as a sequence of bytes.
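The effect can be reproduced in isolation. This is a minimal sketch, assuming the first 14 bytes of the working zip from the hexdump in the question; the class and method names are made up for illustration. It decodes the bytes as a UTF-8 string and encodes them again, which is roughly what happens when binary data passes through a text-typed variable.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what a binary -> UTF-8 text -> binary round trip does.
class Utf8RoundTripSketch {

    // Simulates treating binary data as text: decode as UTF-8, then encode again.
    static byte[] throughUtf8String(byte[] binary) {
        String asText = new String(binary, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // Formats bytes as lowercase hex, matching the style of the hexdumps above.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // First 14 bytes of the working zip: 504b0304140000080800b88e9957
        byte[] original = {
            0x50, 0x4b, 0x03, 0x04, 0x14, 0x00, 0x00, 0x08, 0x08, 0x00,
            (byte) 0xb8, (byte) 0x8e, (byte) 0x99, 0x57
        };
        byte[] corrupted = throughUtf8String(original);

        System.out.println(toHex(original));
        System.out.println(toHex(corrupted));
    }
}
```

The bytes b8, 8e and 99 are not valid UTF-8 on their own, so each one is replaced by EF BF BD and the 14 bytes grow to 20. The result, 504b0304140000080800efbfbdefbfbdefbfbd57, matches the start of the downloaded file's hexdump, and the same growth repeated across the archive would account for the larger download.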