Tags: ruby-on-rails, ruby, heroku, rake, zlib

Heroku: Unpacking a Gzip file through a rake task fails


I'm using Rails 5.2 with Ruby 2.5.1 and deploying my app to Heroku.

I ran into problems when I tried running my rake task on Heroku. The task calls an API which responds with a *.gz file, saves it, unzips it, uses the retrieved JSON to populate the database, and finally deletes the *.gz file. The task runs smoothly in development but fails when called in production. The last line printed to the console is 'Unzipping the file...', so my guess is that the issue originates in the zlib library.

companies_list.rake

require 'json'
require 'open-uri'
require 'zlib'
require 'openssl'
require 'action_view'

include ActionView::Helpers::DateHelper

desc 'Updates Company table'
task update_db: :environment do
  start = Time.now
  zip_file_url = 'https://example.com/api/download'

  TEMP_FILE_NAME = 'companies.gz'

  puts 'Creating folders...'

  tempdir = Dir.mktmpdir
  file_path = "#{tempdir}/#{TEMP_FILE_NAME}"

  puts 'Downloading the file...'

  open(file_path, 'wb') do |file|
    open(zip_file_url, { ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE }) do |uri|
      file.write(uri.read)
    end
  end

  puts 'Download complete.'
  puts 'Unzipping the file...'

  gz = Zlib::GzipReader.new(open(file_path))
  output = gz.read
  @companies_array = JSON.parse(output)

  puts 'Unzipping complete.'

  (...)
end

Has anyone else run into a similar issue and found a way to get it to work?


Solution

  • The issue was caused by the memory limit on Heroku rather than by the Gzip unpacking itself (which is why the problem only occurred in production).

    The solution was to use Json::Streamer so that the whole file is not loaded into memory at once.

    This is the crucial part (it goes after the code posted in the question):

      puts 'Updating the Company table...'
      # `file` is not defined in this snippet; it is assumed to be an open IO on the JSON data
      streamer = Json::Streamer.parser(file_io: file, chunk_size: 1024)  # customize your chunk_size
      streamer.get(nesting_level: 1) do |company|
        # do your stuff with the API data here...
      end
    end
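
The answer does not show how the archive is unpacked before streaming, and calling `gz.read` with no argument (as in the question) still pulls the whole decompressed payload into memory. As a rough sketch only, and not the original author's code, the unpacking step could be done in fixed-size chunks with the standard library. The helper name `gunzip_to_file`, its parameters, and the default chunk size are hypothetical and chosen for illustration.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical helper (not part of the original answer): decompress a .gz
# file to disk in fixed-size chunks, so that at most `chunk_size` bytes of
# decompressed data are held in a Ruby string at any one time.
def gunzip_to_file(gz_path, json_path, chunk_size: 16 * 1024)
  Zlib::GzipReader.open(gz_path) do |gz|
    File.open(json_path, 'wb') do |out|
      # GzipReader#read(len) returns nil once the stream is exhausted,
      # which ends the loop.
      while (chunk = gz.read(chunk_size))
        out.write(chunk)
      end
    end
  end
  json_path
end
```

Under these assumptions, the decompressed file could then be opened with `File.open` and handed to the streamer as `file_io:`, which may be what `file` refers to in the snippet above. Whether this is enough to stay under the dyno's memory limit depends on how much work is done per record.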