I'm using Apache Commons Compress for Java to compress multiple log files into a single tar.bz2 archive.
However, compression takes a very long time (> 12 hours), because I compress around 20 GB of files a day.
As this library compresses files on a single thread, I'd like to know whether there is a way to do this multi-threaded.
I found many solutions (the command-line tool pbzip2 and some C++ libraries), but all I found for Java is this blog post:
https://plus.google.com/117421466255362255970/posts/3jfKVu325zh
It seems that I can't use it in my Java application.
Is there anything out there? What would you recommend? Or is there another, faster solution with compression ratios similar to bzip2?
As you have multiple files, you can compress each file in a different thread. As your process is CPU-bound, I suggest creating a fixed-size thread pool (an ExecutorService) and submitting one task for each file to compress.
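A minimal sketch of the thread-pool approach, under some stated assumptions: the class `ParallelCompressor` and the `Codec` interface are hypothetical names made up for this illustration, and the usage line below uses `GZIPOutputStream` from the JDK only so the sketch runs without extra dependencies. With Commons Compress on the classpath, passing `BZip2CompressorOutputStream::new` as the codec should work the same way. Note that this produces one compressed file per input rather than a single tar.bz2.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: compresses each file in its own pool task.
final class ParallelCompressor {

    /** Wraps a raw output stream in a compressing stream (hypothetical interface). */
    @FunctionalInterface
    interface Codec {
        OutputStream wrap(OutputStream raw) throws IOException;
    }

    /** Writes "<name><suffix>" next to every input file; results are in input order. */
    static List<Path> compressAll(List<Path> inputs, String suffix, Codec codec, int threads)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per file; the pool caps how many run at once.
            List<Future<Path>> pending = new ArrayList<>();
            for (Path input : inputs) {
                Callable<Path> task = () -> compressOne(input, suffix, codec);
                pending.add(pool.submit(task));
            }
            // Wait for every task and surface the first failure as an IOException.
            List<Path> outputs = new ArrayList<>();
            for (Future<Path> future : pending) {
                try {
                    outputs.add(future.get());
                } catch (ExecutionException e) {
                    throw new IOException("compression failed", e.getCause());
                }
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }

    private static Path compressOne(Path input, String suffix, Codec codec) throws IOException {
        Path output = input.resolveSibling(input.getFileName() + suffix);
        try (OutputStream raw = Files.newOutputStream(output);
             OutputStream compressed = codec.wrap(raw)) {
            Files.copy(input, compressed); // streams the file through the codec
        }
        return output;
    }
}
```

A call could look like `ParallelCompressor.compressAll(files, ".gz", java.util.zip.GZIPOutputStream::new, Runtime.getRuntime().availableProcessors())`. Sizing the pool to the number of cores is a reasonable starting point for CPU-bound work.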
Note: if pbzip2 does what you want, I would call it from Java. You might find it is faster even with one thread, as the BZIP2 libraries I have seen for Java are not natively implemented (unlike JAR, ZIP and GZIP).
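Calling pbzip2 from Java could be sketched as follows. This assumes GNU tar and pbzip2 are both installed and on the `PATH`, and relies on GNU tar's `--use-compress-program` option to pipe the archive through pbzip2. The class `Pbzip2Tar` and its method names are hypothetical, chosen only for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: delegates tar.bz2 creation to external tar + pbzip2.
final class Pbzip2Tar {

    /** Builds the tar command line; assumes GNU tar and pbzip2 are on the PATH. */
    static List<String> buildCommand(Path archive, Path sourceDir) {
        return Arrays.asList(
                "tar",
                "--use-compress-program=pbzip2", // let pbzip2 compress on all cores
                "-cf", archive.toString(),
                "-C", sourceDir.toString(),      // archive paths relative to sourceDir
                ".");
    }

    /** Runs the command and fails if tar reports a non-zero exit status. */
    static void createArchive(Path archive, Path sourceDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(archive, sourceDir))
                .inheritIO() // forward tar's output and errors to this process
                .start();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("tar exited with status " + exit);
        }
    }
}
```

Unlike compressing each file separately, this keeps the single tar.bz2 output the question asks for, at the cost of depending on tools outside the JVM.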