java gzip tar gzipinputstream

Java TarInputStream: read the file names of a tar.gz file nested inside another tar.gz file


I'm trying to get the file names of a tar.gz file that sits inside another tar.gz file. Here is the sample code:

try (TarInputStream tis = getStreamRemoteTarGz(url)) { // accessing the first tar.gz file
    TarEntry e;
    while ((e = tis.getNextEntry()) != null) {
        if (e.getName().endsWith(".tar.gz")) {
            // accessing the inner tar.gz file throws
            // java.io.FileNotFoundException: inner_tar_file.tar.gz (No such file or directory)
            try (TarInputStream innerTis = new TarInputStream(new GZIPInputStream(new FileInputStream(e.getName())))) {
                ....
            }
        }
    }
}

As a result, I get FileNotFoundException: inner_tar_file.tar.gz. The file name (inner_tar_file.tar.gz) is correct, so I can read the name of the inner tar.gz entry, but what I want is the names of the files contained in that inner tar.gz. How can I access the file names inside the inner tar.gz file using TarInputStream?


Solution

  • The content of a tar file is not in itself a file, at least not in the sense of java.io.File, which represents a real physical file directly on a file system.

    java.io.File represents anything that can be addressed with a path like C:\somePath\myFile.ext (on Windows) or /home/user/somepath/myFile.ext (on most other operating systems). There is no such path that the operating system would understand and that points directly to an entry inside a .tar.gz.

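    Instead of creating a new FileInputStream, simply pass in the TarInputStream named tis. While tis is positioned on the inner entry, reading from it yields that entry's bytes, so it can be wrapped in a GZIPInputStream and a second TarInputStream.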
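
The idea can be sketched without any tar library. The question does not say which library provides TarInputStream, so the example below is a minimal illustration that uses only java.util.zip and a hand-rolled reader for the basic tar header layout (512-byte blocks, entry name at offset 0, octal size at offset 124). The names NestedTarNames, list, tarGz and demo are hypothetical, and the sketch assumes small archives, because it buffers each entry in memory rather than streaming it. The point it shows is that the nested archive is decoded from bytes read out of the outer stream, never from a file path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of any tar library.
class NestedTarNames {
    static final int BLOCK = 512; // tar headers and data are aligned to 512-byte blocks

    /** Lists entry names of a .tar.gz stream, descending into nested .tar.gz entries. Closes the stream. */
    static List<String> list(InputStream tarGz) throws IOException {
        List<String> names = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(tarGz))) {
            byte[] header = new byte[BLOCK];
            while (true) {
                in.readFully(header);
                if (header[0] == 0) break; // an all-zero block marks the end of the archive
                String name = field(header, 0, 100);
                int size = Integer.parseInt(field(header, 124, 12).trim(), 8);
                byte[] data = new byte[size];
                in.readFully(data);                                  // the entry's bytes come from the stream
                in.readFully(new byte[(BLOCK - size % BLOCK) % BLOCK]); // skip padding to the next block
                names.add(name);
                if (name.endsWith(".tar.gz")) {
                    // Decode the nested archive from the bytes just read, not from a file on disk.
                    for (String inner : list(new ByteArrayInputStream(data))) {
                        names.add(name + "/" + inner);
                    }
                }
            }
        }
        return names;
    }

    /** Reads a NUL-terminated ASCII header field. */
    static String field(byte[] h, int off, int len) {
        int end = off;
        while (end < off + len && h[end] != 0) end++;
        return new String(h, off, end - off, StandardCharsets.US_ASCII);
    }

    /** Demo helper: builds a tiny .tar.gz in memory from name/content pairs (header checksum left blank). */
    static byte[] tarGz(Object... nameThenBytes) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            for (int i = 0; i < nameThenBytes.length; i += 2) {
                byte[] name = ((String) nameThenBytes[i]).getBytes(StandardCharsets.US_ASCII);
                byte[] data = (byte[]) nameThenBytes[i + 1];
                byte[] header = new byte[BLOCK];
                System.arraycopy(name, 0, header, 0, name.length);
                byte[] size = String.format("%011o", data.length).getBytes(StandardCharsets.US_ASCII);
                System.arraycopy(size, 0, header, 124, size.length);
                out.write(header);
                out.write(data);
                out.write(new byte[(BLOCK - data.length % BLOCK) % BLOCK]);
            }
            out.write(new byte[2 * BLOCK]); // end-of-archive marker
        }
        return bytes.toByteArray();
    }

    /** Builds an outer archive containing an inner one and lists all names. */
    static List<String> demo() {
        try {
            byte[] inner = tarGz("a.txt", "hello".getBytes(StandardCharsets.US_ASCII), "b.txt", new byte[600]);
            byte[] outer = tarGz("readme.md", new byte[1], "inner_tar_file.tar.gz", inner);
            return list(new ByteArrayInputStream(outer));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Calling NestedTarNames.demo() returns readme.md, inner_tar_file.tar.gz, inner_tar_file.tar.gz/a.txt and inner_tar_file.tar.gz/b.txt. With the TarInputStream from the question, the equivalent step would be new TarInputStream(new GZIPInputStream(tis)) in place of the FileInputStream chain. One caveat, assuming the library's stream behaves like a typical FilterInputStream: closing the inner wrapper (for example via try-with-resources) also closes tis, so the inner stream is probably best left unclosed until the outer loop has finished.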