Is there a way to access a file inside an archive while ignoring file name case using TrueZIP?
Imagine the following zip archive with this content:
MyZip.zip
-> myFolder/tExtFile.txt
-> anotherFolder/TextFiles/file.txt
-> myFile.txt
-> anotherFile.txt
-> OneMOREfile.txt
This is how it works:
TPath tPath = new TPath("MyZip.zip\\myFolder\\tExtFile.txt");
System.out.println(tPath.toFile().getName()); //prints tExtFile.txt
How can I do the same but ignore case everywhere, like this:
// note "myFolder" changed to "myfolder" and "tExtFile" to "textfile"
TPath tPath = new TPath("MyZip.zip\\myfolder\\textfile.txt");
System.out.println(tPath.toFile().getName()); // should print tExtFile.txt
The code above throws FsEntryNotFoundException ... (no such entry)
It works for regular java.io.File, so I am not sure why it does not work for TrueZIP's TFile, or am I missing something?
My goal is to access each file using only lowercase names for files and folders.
Edit: 24-03-2017
Let's say I would like to read bytes from a file inside the zip archive MyZip.zip mentioned above:
Path tPath = new TPath("...MyZip.zip\\myFolder\\tExtFile.txt");
byte[] bytes = Files.readAllBytes(tPath); //returns bytes of the file
The snippet above works, but the one below does not (it throws the FsEntryNotFoundException mentioned above). It is the same path and file, just in lowercase.
Path tPath = new TPath("...myzip.zip\\myfolder\\textfile.txt");
byte[] bytes = Files.readAllBytes(tPath);
You said:
My goal is to access each file just using only lowercase for files and folders.
But wishful thinking will not get you very far here. As a matter of fact, most file systems (except Windows types) are case-sensitive, i.e. in them it makes a big difference whether you use upper- or lower-case characters. There you can even have the "same" file name in different case multiple times in the same directory, i.e. file.txt, File.txt and file.TXT are three different names. Windows is really an exception here, but TrueZIP does not emulate a Windows file system. It emulates a general archive file system which works for ZIP, TAR etc. on all platforms. Thus, you do not get to choose between upper- and lower-case characters; you have to use them exactly as stored in the ZIP archive.
Update: Just as a little proof, I logged into a remote Linux box with an extfs file system and did this:
~$ mkdir test
~$ cd test
~/test$ touch file.txt
~/test$ touch File.txt
~/test$ touch File.TXT
~/test$ ls -l
total 0
-rw-r--r-- 1 group user 0 Mar 25 00:14 File.TXT
-rw-r--r-- 1 group user 0 Mar 25 00:14 File.txt
-rw-r--r-- 1 group user 0 Mar 25 00:14 file.txt
As you can clearly see, there are three distinct files, not just one.
And what happens if you zip those three files into an archive?
~/test$ zip ../files.zip *
adding: File.TXT (stored 0%)
adding: File.txt (stored 0%)
adding: file.txt (stored 0%)
Three files added. But are they still distinct files in the archive or just stored under one name?
~/test$ unzip -l ../files.zip
Archive: ../files.zip
Length Date Time Name
--------- ---------- ----- ----
0 2017-03-25 00:14 File.TXT
0 2017-03-25 00:14 File.txt
0 2017-03-25 00:14 file.txt
--------- -------
0 3 files
"3 files", it says - quod erat demonstrandum.
As you can see, Windows is not the whole world. But if you copy that archive to a Windows box and unzip it there, it will only write one file to a disk with NTFS or FAT file system - which one is a matter of luck. Very bad if the three files have different contents.
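The same point can be reproduced without a Linux box. The following is only a minimal sketch using the JDK's own java.util.zip classes (not TrueZIP); the class name CaseSensitiveZipDemo and its two helper methods are made up for this illustration. It writes three empty entries whose names differ only in case into an in-memory archive and reads the entry names back:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of TrueZIP
final class CaseSensitiveZipDemo {

    /** Writes one empty entry per given name into an in-memory ZIP archive. */
    public static byte[] zipOf(String... names) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : names) {
                // Only an exact duplicate name would be rejected here
                zip.putNextEntry(new ZipEntry(name));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    /** Returns the entry names in the order in which they are stored in the archive. */
    public static List<String> entryNames(byte[] archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null)
                names.add(entry.getName());
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // prints [File.TXT, File.txt, file.txt]
        System.out.println(entryNames(zipOf("File.TXT", "File.txt", "file.txt")));
    }
}
```

All three names survive the round trip, so the ZIP format itself treats them as distinct entries regardless of the platform the code runs on.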
Update 2: Okay, there is no solution within TrueZIP for the reasons explained in detail above, but if you want to work around it, you can do it manually like this:
package de.scrum_master.app;

import de.schlichtherle.truezip.nio.file.TPath;

import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.Files;

public class Application {
    public static void main(String[] args) throws IOException, URISyntaxException {
        // The path deliberately mixes separators and uses the wrong case for most elements
        TPathHelper tPathHelper = new TPathHelper(
            new TPath(
                "../../../downloads/powershellarsenal-master.zip/" +
                    "PowerShellArsenal-master\\LIB/CAPSTONE\\LIB\\X64\\LIBCAPSTONE.DLL"
            )
        );
        TPath caseSensitivePath = tPathHelper.getCaseSensitivePath();
        System.out.printf("Original path: %s%n", tPathHelper.getOriginalPath());
        System.out.printf("Case-sensitive path: %s%n", caseSensitivePath);
        System.out.printf("File size: %,d bytes%n", Files.readAllBytes(caseSensitivePath).length);
    }
}
package de.scrum_master.app;

import de.schlichtherle.truezip.file.TFile;
import de.schlichtherle.truezip.nio.file.TPath;

import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.Path;

public class TPathHelper {
    private final TPath originalPath;
    private TPath caseSensitivePath; // resolved lazily, then cached

    public TPathHelper(TPath tPath) {
        originalPath = tPath;
    }

    public TPath getOriginalPath() {
        return originalPath;
    }

    public TPath getCaseSensitivePath() throws IOException, URISyntaxException {
        if (caseSensitivePath != null)
            return caseSensitivePath;
        final TPath absolutePath = new TPath(originalPath.toFile().getCanonicalPath());
        TPath matchingPath = absolutePath.getRoot();
        // Walk the path element by element, taking the first entry whose name matches ignoring case
        for (Path subPath : absolutePath) {
            TFile[] candidateFiles = matchingPath.toFile().listFiles();
            // listFiles() returns null if matchingPath is not a directory or cannot be listed
            if (candidateFiles == null)
                throw new IOException("cannot list contents of '" + matchingPath + "'");
            boolean matchFound = false;
            for (TFile candidateFile : candidateFiles) {
                if (candidateFile.getName().equalsIgnoreCase(subPath.toString())) {
                    matchFound = true;
                    matchingPath = new TPath(matchingPath.toString(), candidateFile.getName());
                    break;
                }
            }
            if (!matchFound)
                throw new IOException("element '" + subPath + "' not found in '" + matchingPath + "'");
        }
        caseSensitivePath = matchingPath;
        return caseSensitivePath;
    }
}
Of course, this is a little ugly and will just get you the first matching path if there are multiple case-insensitive matches in an archive. The algorithm will stop searching after the first match in each subdirectory. I am not particularly proud of this solution, but it was a nice exercise and you seem to insist that you want to do it this way. I just hope you are never confronted with a UNIX-style ZIP archive created on a case-sensitive file system and containing multiple possible matches.
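If silently taking the first match worries you, the name comparison could refuse ambiguous names instead. The following is just a sketch under my own assumptions: CaseInsensitiveMatcher and resolve are hypothetical names, the class works on plain strings and knows nothing about TrueZIP, and preferring an exact match over case-insensitive ones is a design choice of this sketch. In the helper above you would feed it the names returned by listFiles() for each path element.

```java
import java.io.IOException;
import java.nio.file.NoSuchFileException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TrueZIP
final class CaseInsensitiveMatcher {

    /**
     * Returns the candidate equal to the wanted name. An exact match wins;
     * otherwise there must be exactly one match ignoring case.
     */
    public static String resolve(List<String> candidates, String wanted) throws IOException {
        if (candidates.contains(wanted))
            return wanted;
        List<String> matches = new ArrayList<>();
        for (String candidate : candidates) {
            if (candidate.equalsIgnoreCase(wanted))
                matches.add(candidate);
        }
        if (matches.isEmpty())
            throw new NoSuchFileException(wanted);
        if (matches.size() > 1)
            throw new IOException("ambiguous name '" + wanted + "', candidates: " + matches);
        return matches.get(0);
    }
}
```

With the candidates Lib, lib and README.md, asking for readme.MD yields README.md and asking for lib yields lib (exact match), whereas asking for LIB fails with an IOException because both Lib and lib would match.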
BTW, the console log for my sample file looks like this:
Original path: ..\..\..\downloads\powershellarsenal-master.zip\PowerShellArsenal-master\LIB\CAPSTONE\LIB\X64\LIBCAPSTONE.DLL
Case-sensitive path: C:\Users\Alexander\Downloads\PowerShellArsenal-master.zip\PowerShellArsenal-master\Lib\Capstone\lib\x64\libcapstone.dll
File size: 3.629.294 bytes