ruby, csv, bzip2

Compressing using Bzip2 on-the-fly to a file?


There is a program that generates huge CSV files. For example:

require 'csv'

arr = (0..10).to_a
CSV.open("foo.csv", "wb") do |csv|
  (2**16).times { csv << arr }
end

This generates a big file, so I want it to be compressed on-the-fly: instead of writing an uncompressed CSV file (foo.csv), the program should write a bzip2-compressed CSV file (foo.csv.bzip).

I have an example from the "ruby-bzip2" gem:

writer = Bzip2::Writer.new File.open('file', 'wb')
writer << 'data1'
writer.close

I am not sure how to combine the Bzip2 writer with the CSV one.


Solution

  • You can also construct a CSV object with an IO or something sufficiently like an IO, such as a Bzip2::Writer.

    For example:

    File.open('file.bz2', 'wb') do |f|
      writer = Bzip2::Writer.new f
      CSV(writer) do |csv|
        (2**16).times { csv << arr }
      end
      writer.close  # finish the bzip2 stream and flush any buffered data
    end
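
The same idea can be tried without installing a bzip2 gem. The following is only a sketch under stated assumptions: it swaps in `Zlib::GzipWriter` from Ruby's standard library as a stand-in for `Bzip2::Writer` (so the output is gzip, not bzip2), and `write_compressed_csv` is a hypothetical helper name, not part of any library. It shows that CSV only needs an object it can append to with `<<`, and it decompresses the result to check that every row arrived.

```ruby
require 'csv'
require 'zlib'
require 'stringio'

# Hypothetical helper: streams `count` copies of `row` as CSV through a
# compressor into `io`. Gzip (stdlib) stands in for bzip2 here.
def write_compressed_csv(io, row, count)
  gz  = Zlib::GzipWriter.new(io)
  csv = CSV.new(gz)           # CSV appends each formatted line to gz with <<
  count.times { csv << row }
  gz.finish                   # finalize the compressed stream, leave io open
  io
end

# An in-memory binary buffer is used so the sketch leaves no files behind.
buffer = StringIO.new(String.new)
write_compressed_csv(buffer, (0..10).to_a, 2**16)

# Round trip: decompress and inspect what was written.
text = Zlib.gunzip(buffer.string)
puts text.lines.size    # 65536
puts text.lines.first   # 0,1,2,3,4,5,6,7,8,9,10
```

Passing a `File` opened with `'wb'` instead of the `StringIO` would write the compressed data to disk as the rows are produced, which is the on-the-fly behaviour asked for in the question.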