unix compression bzip2

Does bzip2 create an output file even if the command was interrupted or canceled while in progress?


I was logged in via SSH to a remote machine and compressing a single 90GB XML file there using:

bzip2 myfile.xml

My connection timed out, so I am not sure whether bzip2 worked, but I ended up with an output file myfile.xml.bz2.

When a bzip2 command fails to fully execute, does it save an output file or not?


Solution

  • A more appropriate question would be whether it cleans up after itself.

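    bzip2 compresses data in relatively small blocks (100 kB to 900 kB of input, depending on the -1 to -9 level) and writes each compressed block out before it proceeds to the next one. This lets it run on memory-constrained systems and still process practical amounts of data: it needs less than 8 MB of RAM to compress even that 90 GB XML file of yours. It is also why a partial myfile.xml.bz2 exists on disk while the compression is still running.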

    If you inspect the source file bzip2.c, you will see that it does clean up in the function void cleanUpAndFail(Int32 ec): it deletes the partial output file, provided the input file still exists. Of course, if the process was killed outright before it could run to completion (for example with SIGKILL, which cannot be caught), it wouldn't get the chance to do that.


    In your case, if myfile.xml still exists and you didn't explicitly tell bzip2 to keep it with -k (it seems you didn't), then bzip2 most likely got killed before completing and myfile.xml.bz2 is incomplete. If myfile.xml is gone, then bzip2 most likely completed without issues. Either way, you can run bzip2 -tv myfile.xml.bz2 to test the integrity of the output file.
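
The two checks described in this answer (is the input file still there, and does the output pass an integrity test) can be combined into a small shell helper. This is only a sketch: the function name check_bzip2_result and the status words it prints are made up for illustration, and it assumes bzip2 was run without -k so that a surviving input file is meaningful.

```shell
#!/bin/sh
# Hypothetical helper (not part of bzip2): report the likely outcome of
# an earlier `bzip2 FILE` run, given the name of the original input file.
check_bzip2_result() {
    src=$1
    out=$src.bz2

    if [ ! -e "$out" ]; then
        echo "no-output"
        return 1
    fi

    if [ -e "$src" ]; then
        # Without -k, bzip2 removes the input only after it finishes,
        # so a surviving input suggests the run was cut short.
        echo "input-still-present"
    else
        echo "input-removed"
    fi

    # -t decompresses and verifies the data without writing any output.
    if bzip2 -t "$out" 2>/dev/null; then
        echo "archive-ok"
    else
        echo "archive-damaged"
        return 2
    fi
}
```

For the file in the question you would call check_bzip2_result myfile.xml. Seeing "input-still-present" followed by "archive-damaged" would point to an interrupted run, in which case deleting myfile.xml.bz2 and compressing again is the safe choice.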