Tags: java, tar, apache-commons-compress

How to convert TarArchiveOutputStream to byte array without saving into file system


I have a byte array representation of a tar.gz file. I want to get the byte array representation of a new tar.gz file after adding a new config file to it. I want to do this entirely in code, without creating any files on the local disk.

Below is my code in Java:

            InputStream fIn = new ByteArrayInputStream(inputBytes);
            BufferedInputStream in = new BufferedInputStream(fIn);
            GzipCompressorInputStream gzIn = new GzipCompressorInputStream(in);
            TarArchiveInputStream tarInputStream = new TarArchiveInputStream(gzIn);

            ByteArrayOutputStream fOut = new ByteArrayOutputStream();
            BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
            GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
            TarArchiveOutputStream tarOutputStream = new TarArchiveOutputStream(gzOut);

            ArchiveEntry nextEntry;
            while ((nextEntry = tarInputStream.getNextEntry()) != null) {
                tarOutputStream.putArchiveEntry(nextEntry);
                IOUtils.copy(tarInputStream, tarOutputStream);
                tarOutputStream.closeArchiveEntry();
            }
            tarInputStream.close();
            createTarArchiveEntry("config.json", configData, tarOutputStream);
            tarOutputStream.finish();
            // Convert tarOutputStream to byte array and return





    private static void createTarArchiveEntry(String fileName, byte[] configData, TarArchiveOutputStream tOut)
            throws IOException {

        ByteArrayInputStream baOut1 = new ByteArrayInputStream(configData);

        TarArchiveEntry tarEntry = new TarArchiveEntry(fileName);
        tarEntry.setSize(configData.length);
        tOut.putArchiveEntry(tarEntry);
        byte[] buffer = new byte[1024];
        int len;
        while ((len = baOut1.read(buffer)) > 0) {
            tOut.write(buffer, 0, len);
        }
        tOut.closeArchiveEntry();

    }

How can I convert tarOutputStream to a byte array?


Solution

  • You have opened several OutputStream instances, but you haven't closed them yet. More precisely, you haven't "flushed" their content, especially that of the BufferedOutputStream instance.

    BufferedOutputStream uses an internal buffer to hold written data back before passing it on to the target OutputStream. It keeps the data there until there is a reason to write it out. One of these "reasons" is a call to the BufferedOutputStream.flush() method:

    public void flush() throws IOException

    Flushes this buffered output stream. This forces any buffered output bytes to be written out to the underlying output stream.

    Another "reason" is closing the stream, which writes out the remaining bytes before the stream is closed.

    In your case, the bytes being written are still stored in the internal buffer. Depending on your code structure, you can simply close all the OutputStream instances you have, so that the bytes finally get written to the ByteArrayOutputStream:

    tarInputStream.close();
    createTarArchiveEntry("config.json", configData, tarOutputStream);
    tarOutputStream.finish();
    // Close the streams, outermost first, so all buffered bytes reach fOut
    tarOutputStream.close();
    gzOut.close();
    buffOut.close();
    fOut.close();

    // Only now does fOut hold the complete tar.gz content
    byte[] content = fOut.toByteArray();
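
The buffering behaviour described above can be reproduced with the JDK alone. The following is a minimal sketch, not the code from the question: it assumes java.util.zip.GZIPOutputStream as a stand-in for Commons Compress's GzipCompressorOutputStream so that it runs without extra dependencies, and the class and method names (InMemoryStreamDemo, sizesAroundFlush, gzipInMemory, gunzip) are hypothetical. The first method shows that bytes stay in the BufferedOutputStream until flush() is called; the second shows that reading the ByteArrayOutputStream after the wrapping streams are closed yields a complete gzip stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; all names here are illustrative assumptions.
final class InMemoryStreamDemo {

    // Returns {target size before flush, target size after flush}.
    static int[] sizesAroundFlush() throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(target, 1024);
        buffered.write(new byte[100]);   // fits in the 1024-byte buffer, so it stays there
        int before = target.size();      // nothing has reached the target yet
        buffered.flush();                // forces the buffered bytes into the target
        int after = target.size();
        buffered.close();
        return new int[] {before, after};
    }

    // Compresses data entirely in memory and returns the gzip bytes.
    static byte[] gzipInMemory(byte[] data) throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        // try-with-resources closes gz, which also closes the wrapped BufferedOutputStream
        try (GZIPOutputStream gz = new GZIPOutputStream(new BufferedOutputStream(target))) {
            gz.write(data);
        }
        // Read the bytes only after close(): the gzip trailer and buffer are written by then
        return target.toByteArray();
    }

    // Decompresses gzip bytes in memory, used here to check the round trip.
    static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes(); // requires Java 9 or later
        }
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);

        byte[] config = "{\"key\":\"value\"}".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzipInMemory(config));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

In the question's code the same idea would presumably apply to the whole chain: declaring tarOutputStream in a try-with-resources block and calling fOut.toByteArray() after that block should have the same effect as the explicit close() calls shown above.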