I'm looking for a general compression library that supports random access during decompression. I want to compress wikipedia into a single compressed format and at the same time I want to decompress/extract individual articles from it.
Of course, I can compress each articles individually, but this won't give much compression ratio. I've heard LZO compressed file consists of many chunks which can be decompressed separately, but I haven't found out API+documentation for that. I can also use the Z_FULL_FLUSH mode in zlib, but is there any other better alternative?
xz-format files support an index, though by default the index is not useful. My compressor, pixz, creates files that do contain a useful index. You can use the functions in the liblzma library to find which block of xz data corresponds to which location in the uncompressed data.