I can get a text file as a String with new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8). How do I achieve the same result if the file is in a folder inside a zip file? I know I can get the zip as a ZipFile and the folder as a ZipEntry, but I'm not clear on how to get the file or how to make a String out of it. I don't want to create any files or folders to get it.
EDIT: Per dpr's answer, here's what I used:
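// Needs java.io.FileNotFoundException, java.io.InputStream, java.util.Scanner,
// java.util.zip.ZipEntry and java.util.zip.ZipFile.
String fileAsString;
try (ZipFile zip = new ZipFile(path)) {
    ZipEntry entry = zip.getEntry("folder/file.txt");
    // Some tools write backslashes as the separator, so try that variant too.
    if (entry == null) entry = zip.getEntry("folder\\file.txt");
    // getInputStream would fail on a null entry, so report the missing file explicitly.
    if (entry == null) throw new FileNotFoundException("folder/file.txt not found in " + path);
    try (InputStream is = zip.getInputStream(entry);
         Scanner s = new Scanner(is, "UTF-8").useDelimiter("\\A")) {
        // "\\A" matches only the start of input, so next() returns the whole stream.
        fileAsString = s.hasNext() ? s.next() : "";
    }
}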
Technically, there is no such thing as a directory inside a zip file. Everything in a zip file is an entry (ZipEntry in Java). You can use the isDirectory method to determine whether an entry represents a directory of the zipped file system structure or a regular file. The name attribute of a ZipEntry always reflects the full directory hierarchy of the originally zipped file, relative to the archive's root. That is, for a file Data\Folder1\example.txt you will typically have three ZipEntries in your zip file: one for Data, one for Data\Folder1 and one for Data\Folder1\example.txt. Some tools omit the directory entries, so don't rely on them being present.
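To see which entries a particular archive actually contains, a minimal sketch like the following can help. The class and method names (ZipEntryLister, describeEntries) are made up for illustration; only ZipFile.entries(), ZipEntry.getName() and ZipEntry.isDirectory() come from the JDK.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answer.
final class ZipEntryLister {

    /** Returns one line per entry, prefixed with "dir  " or "file ". */
    static List<String> describeEntries(Path zipPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // isDirectory() is true only when the entry name ends with '/'.
                String kind = entry.isDirectory() ? "dir  " : "file ";
                lines.add(kind + entry.getName());
            }
        }
        return lines;
    }
}
```

Printing the result of ZipEntryLister.describeEntries(Paths.get("archive.zip")) shows the exact names, including the separator character, that a later lookup has to match.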
By simply iterating over the ZipEntries of your ZipFile and matching the path and file name of your desired file, you should easily find the desired entry. The contents of this entry can then be extracted using the already suggested ZipFile.getInputStream(ZipEntry) method.
See this question and its answers for examples of how to read an InputStream into a String.
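If you'd rather not add a library, a JDK-only approach is to copy the stream into a ByteArrayOutputStream and decode the bytes. This is only a sketch: the StreamText class and readUtf8 method are hypothetical names, and it assumes the content is UTF-8 and small enough to fit in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original answer.
final class StreamText {

    /** Reads the stream to its end and decodes the bytes as UTF-8. */
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        // read() returns -1 once the end of the stream has been reached.
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

On Java 9 or later, new String(in.readAllBytes(), StandardCharsets.UTF_8) does the same in one line.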
Using Apache Commons IO (IOUtils) to read the InputStream into a String, this could look something like this:
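import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import org.apache.commons.io.IOUtils;

public String getFileContentsAsString(final File pZipFile, final String pFileName) throws IOException {
    try (ZipFile zipFile = new ZipFile(pZipFile)) {
        Enumeration<? extends ZipEntry> entries = zipFile.entries();
        while (entries.hasMoreElements()) {
            ZipEntry currentEntry = entries.nextElement();
            if (matchesDesiredFile(pFileName, currentEntry)) {
                try (InputStream entryIn = zipFile.getInputStream(currentEntry)) {
                    return IOUtils.toString(entryIn, StandardCharsets.UTF_8);
                }
            }
        }
    }
    // No entry with the requested name was found.
    return null;
}

private boolean matchesDesiredFile(final String pFileName, final ZipEntry pZipEntry) {
    // Only regular files whose full entry name equals the requested path match.
    return !pZipEntry.isDirectory() && pZipEntry.getName().equals(pFileName);
}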
If you're simply matching against the name attribute of the entry, you could of course just as well use
ZipEntry zipEntry = zipFile.getEntry(filePathWithinZipArchive);
to get the desired entry instead of iterating over the entries "manually". Note that getEntry returns null if no entry with that name exists.
Note that you should be careful about the separator character used for directories. As pointed out here, it's up to the application that creates the zip file whether to use \ (backslash) or / (forward slash) as the directory separator character. I tried this on a Mac using the zip terminal command, and both the ZipEntry's name and the original file name were Data/Folder1/example.txt. If you create the zip using a different tool, the name of the ZipEntry might be Data\Folder1\example.txt. Even mixed variants (one ZipEntry using forward slashes and another one using backslashes) are possible. You may want to account for this if you have no control over the zip creation process.
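One way to cope with unknown or mixed separators is to compare names after normalizing both sides to forward slashes. The sketch below is only an illustration: ZipLookup, normalize and findEntry are hypothetical names, and it assumes a backslash in an entry name is always meant as a separator (on Unix it can be a legitimate file name character).

```java
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answer.
final class ZipLookup {

    /** Rewrites backslashes as forward slashes so both styles compare equal. */
    static String normalize(String name) {
        return name.replace('\\', '/');
    }

    /** Returns the matching file entry regardless of separator style, or null. */
    static ZipEntry findEntry(ZipFile zip, String path) {
        String wanted = normalize(path);
        // Fast path: the archive uses forward slashes.
        ZipEntry direct = zip.getEntry(wanted);
        if (direct != null && !direct.isDirectory()) {
            return direct;
        }
        // Slow path: scan all entries and compare normalized names.
        Enumeration<? extends ZipEntry> entries = zip.entries();
        while (entries.hasMoreElements()) {
            ZipEntry entry = entries.nextElement();
            if (!entry.isDirectory() && normalize(entry.getName()).equals(wanted)) {
                return entry;
            }
        }
        return null;
    }
}
```

The returned entry can be passed to ZipFile.getInputStream(ZipEntry) as before. The scan is linear in the number of entries, which is usually acceptable unless you look up many files in a very large archive.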