I want to save the contents of a tar.gz archive inside a database table.
The archive contains txt files in CSV format.
The idea is to insert a new row in the database for each line in the txt files.
The problem is that I can't read the contents of one file on its own and then move on to the next file.
Below EntryTable and EntryTableLine are Hibernate entities.
EntryTable is in a OneToMany relationship with EntryTableLine (a file -EntryTable- can have many lines -EntryTableLine-).
public static final int TAB = 9;
FileInputStream fileInputStream = new FileInputStream(fileLocation);
GZIPInputStream gzipInputStream = new GZIPInputStream(fileInputStream);
TarArchiveInputStream tar = new TarArchiveInputStream(gzipInputStream);
BufferedReader reader = new BufferedReader(new InputStreamReader(tar));
// Columns are delimited with TAB
CSVFormat csvFormat = CSVFormat.TDF.withHeader().withDelimiter((char) TAB);
CSVParser parser = new CSVParser(reader, csvFormat);
TarArchiveEntry tarEntry = tar.getNextTarEntry();
while(tarEntry != null){
EntryTable entryTable = new EntryTable();
entryTable.setFilename(tarEntry.getName());
if(reader != null){
// Here is the problem
for(CSVRecord record : parser){
// StringBuilder accumulates every column of the record
StringBuilder line = new StringBuilder();
int i = 1;
for(String val : record){
line.append("<column").append(i).append(">").append(val).append("</column").append(i).append(">");
i++;
}
EntryTableLine entryTableLine = new EntryTableLine();
entryTableLine.setContent(line.toString());
entryDao.saveLine(entryTableLine);
}
}
tarEntry = tar.getNextTarEntry();
}
I tried converting tarEntry.getFile() to InputStream, but tarEntry.getFile() is null unfortunately.
Let's say I have 4 files in the archive. Each file has 3 lines inside. However, in the database, some entries have 5 lines while others have none.
Thank you!
Doing something similar to this solved the problem:
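TarArchiveEntry entry = tarInput.getNextTarEntry();
// getSize() returns a long; the cast assumes the entry fits in an int-sized array
byte[] content = new byte[(int) entry.getSize()];
int offset = 0;
// read() may return fewer bytes than requested, so loop until the whole entry is read
while (offset < content.length) {
int read = tarInput.read(content, offset, content.length - offset);
if (read == -1) break; // end of the current entry
offset += read;
}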
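To illustrate the idea with something that runs without the archive libraries, here is a minimal sketch. It uses a plain InputStream as a stand-in for the TarArchiveInputStream, and a simple split on TAB instead of the CSV parser, so it ignores headers and quoting. The names EntryContentReader, readEntry and toColumnLines are made up for this example, and UTF-8 is an assumption about the file encoding. The point is that each entry is read fully into its own byte array first, and only then parsed through a fresh reader, so nothing is buffered across entries.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EntryContentReader {

    // Reads up to `size` bytes from `in`, looping because read() may return
    // fewer bytes than requested. With a tar stream, `size` would be
    // (int) entry.getSize() (assumption: the entry fits in an int).
    static byte[] readEntry(InputStream in, int size) throws IOException {
        byte[] content = new byte[size];
        int offset = 0;
        while (offset < size) {
            int read = in.read(content, offset, size - offset);
            if (read == -1) {
                break; // stream ended before `size` bytes were available
            }
            offset += read;
        }
        // Trim the array if the stream ended early
        return offset == size ? content : Arrays.copyOf(content, offset);
    }

    // Parses one entry's bytes with its own reader and turns every
    // TAB-separated line into <column1>..</column1><column2>..</column2>...
    static List<String> toColumnLines(byte[] content) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(content), StandardCharsets.UTF_8))) {
            String raw;
            while ((raw = reader.readLine()) != null) {
                StringBuilder line = new StringBuilder();
                int i = 1;
                for (String val : raw.split("\t", -1)) {
                    line.append("<column").append(i).append('>')
                        .append(val)
                        .append("</column").append(i).append('>');
                    i++;
                }
                lines.add(line.toString());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Two fake "entries" of 4 bytes each, back to back in one stream
        byte[] data = "a\tb\nc\td\n".getBytes(StandardCharsets.UTF_8);
        InputStream in = new ByteArrayInputStream(data);
        System.out.println(toColumnLines(readEntry(in, 4)));
        System.out.println(toColumnLines(readEntry(in, 4)));
    }
}
```

In the real code, the byte array of each entry could presumably be wrapped in a ByteArrayInputStream and handed to a new CSVParser per entry, rather than sharing one parser and one BufferedReader across the whole archive.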