
Create a TAR file (tarball) using ColdFusion/Java on a Windows server


I need to create a TAR file containing multiple files on a Windows server using ColdFusion/Java. I have found lots of examples of unpacking TAR files, but very little on creating them. I found this example of using gzip to add some text to a file, and it works, but I need to add files. I'm also not 100% sure that gzip is the same thing as building a tarball. This project was assigned to me with a very short turnaround and I'm spinning my wheels, so any help in the right direction is greatly appreciated:

Windows Server 2012, ColdFusion 10, Java 1.7.0_15

    <cfset lineBreak = chr(13) & chr(10) />
    <!--- open the sitemap file --->
    <cfset tarFilePath = "#application.imageingFolder#DTSimages\Pending\tiff.gz" />
    #tarFilePath#
    <!--- create streams --->
    <cfset outputStream = CreateObject("java", "java.io.FileOutputStream").Init(
                CreateObject("java","java.io.File").Init(tarFilePath)) />
    <cfset gzipStream = CreateObject("java", "java.util.zip.GZIPOutputStream").Init(outputStream) />
    <cfsavecontent variable="siteMapHeader"><?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.sitemaps.org/schemas/sitemap-image/1.1"
        xmlns:video="http://www.sitemaps.org/schemas/sitemap-video/1.1">
    </cfsavecontent>
    <cfset siteMapFooter = "</urlset>" />
    <cfset gzipStream.write(ToString(siteMapHeader).GetBytes()) />

    <cfset gzipStream.close() />
    <cfset outputStream.close() />

Solution

  • Though often used together, tar and gzip are different things. From ZIP vs. GZIP:

    ... The tar command is used to create an archive (not compressed) and another program (gzip or compress) is used to compress the archive.
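To make the distinction concrete, here is a minimal Java sketch that uses only the JDK's java.util.zip classes (the same GZIPOutputStream the question's code creates). The class name GzipOnly and its two helper methods are hypothetical names chosen for illustration. The point is that GZIPOutputStream compresses one stream of bytes and has no notion of archive entries or file names, so on its own it cannot bundle multiple files the way tar does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class: gzip compresses a single byte stream, nothing more.
public class GzipOnly {

    // Compress one byte array. Note there is no entry name or file list involved.
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(data);
        }
        return buffer.toByteArray();
    }

    // Decompress back to the original bytes, reading in small chunks.
    public static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "<urlset></urlset>".getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(original);
        // Round trip: the decompressed bytes are the single original stream.
        System.out.println(new String(gunzip(packed), StandardCharsets.UTF_8));
    }
}
```

A .tar.gz file is simply a tar archive that has then been passed through this kind of single-stream compression.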

    A simple option is to install 7-Zip (which supports tar and gzip) and invoke it from cfexecute with the appropriate arguments. To create a TAR file (not compressed):

    <!--- 
       Arguments, in order:
       "a"     - add files to archive 
       "-ttar" - type of archive (tar) 
       path of the output .tar file 
       space-separated list of files to add 
    --->
    <cfexecute name="c:\Program Files\7-Zip\7z.exe"
        arguments=" a -ttar c:\path\myarchive.tar c:\path\file1.xlsx c:\temp\otherfile.txt"
        variable="output"
        errorVariable="error"
        timeout="60"    />
    
    <cfoutput>
        output = #output#<br>
        error = #error#<br>
    </cfoutput> 
    

    For Java options, this thread mentions using the Apache Commons Compress library for creating TAR files. It also happens to be bundled with CF, so it should work out of the box:

    http://www.oracle.com/technetwork/articles/java/compress-1565076.html

    <cfscript>
        // Initialize TAR file to generate
        outputPath = "c:/temp/outputFile3.tar";
        os = createObject("java", "java.io.FileOutputStream").init(outputPath);
        tar = createObject("java", "org.apache.commons.compress.archivers.tar.TarArchiveOutputStream").init(os);
        
        // Add an entry from a string
        someTextContent = '<?xml version="1.0" encoding="UTF-8"?>....';
        binaryContent = charsetDecode(someTextContent, "utf-8");
        entry = createObject("java", "org.apache.commons.compress.archivers.tar.TarArchiveEntry").init("siteHeader.xml");
        entry.setSize(arrayLen(binaryContent));
        tar.putArchiveEntry(entry);
        tar.write(binaryContent);
        tar.closeArchiveEntry();
        
        // Create an entry from a file
        inputFile = createObject("java", "java.io.File").init("c:/path/someImage.jpg");
        entry = tar.createArchiveEntry(inputFile, "myImage.jpg");
        tar.putArchiveEntry(entry);
        tar.write(FileReadBinary(inputFile.getPath())); // pass the path explicitly as a string
        tar.closeArchiveEntry();
    
        // Close TAR file
        tar.flush();
        tar.close();
    </cfscript>
    

    See the documentation for more details: Apache Commons - The TAR package

    NB: If you are archiving large files, look into buffering, so that each file is copied in chunks instead of being read into memory all at once. For the basic concept, see Code Sample 3: Zip.java. It targets Zip files, but the concept is the same; only the classes differ.
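
The buffering idea in the note above can be sketched in plain Java as follows. This is a rough illustration, not code from the linked sample: BufferedCopy and copy are hypothetical names, and the 4096-byte buffer size is an arbitrary choice. The assumption for the TAR case is that `out` would be the TarArchiveOutputStream (after putArchiveEntry has been called for the entry) and `in` a FileInputStream for the file being added, so the file is streamed in chunks rather than loaded whole as FileReadBinary does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copy a stream in fixed-size chunks.
public class BufferedCopy {

    // Reads up to bufferSize bytes at a time and writes them straight out,
    // so memory use stays at bufferSize regardless of the input length.
    // Returns the total number of bytes copied.
    public static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for a FileInputStream and an archive stream.
        byte[] data = new byte[10000];
        InputStream in = new ByteArrayInputStream(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(in, out, 4096);
        System.out.println("bytes copied: " + copied);
    }
}
```

From ColdFusion, the same loop could be written in cfscript against the Java stream objects, or the class could be compiled and loaded with createObject; either way the caller remains responsible for closing the input stream and for calling closeArchiveEntry afterwards.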