I'm using Rails 5.2 with Ruby 2.5.1 and am deploying my app to Heroku.
I ran into problems when I tried running my rake task in production. The task calls an API which responds with a *.gz
file, saves it, unzips it, and then uses the retrieved JSON to populate the database, finally deleting the *.gz
file. The task runs smoothly in development, but fails when called in production. The last line printed to the console is 'Unzipping the file...', so my guess is that the issue originates in the zlib
library.
companies_list.rake
require 'json'
require 'open-uri'
require 'zlib'
require 'openssl'
require 'action_view'
include ActionView::Helpers::DateHelper
desc 'Updates Company table'
task update_db: :environment do
  start = Time.now
  zip_file_url = 'https://example.com/api/download'
  TEMP_FILE_NAME = 'companies.gz'

  puts 'Creating folders...'
  tempdir = Dir.mktmpdir
  file_path = "#{tempdir}/#{TEMP_FILE_NAME}"

  puts 'Downloading the file...'
  open(file_path, 'wb') do |file|
    open(zip_file_url, { ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE }) do |uri|
      file.write(uri.read)
    end
  end
  puts 'Download complete.'

  puts 'Unzipping the file...'
  gz = Zlib::GzipReader.new(open(file_path))
  output = gz.read
  @companies_array = JSON.parse(output)
  puts 'Unzipping complete.'

  (...)
end
Has anyone else run into a similar issue and found a way to get it to work?
The issue was linked to the memory limit rather than to gzip unpacking (which is why the problem only occurred in production).
The solution was to use Json::Streamer,
so that the whole file is not loaded into memory at once.
This is the crucial part (it goes after the code posted in the question):
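The same idea applies to the download step: `uri.read` in the question buffers the entire *.gz response in memory before writing it out, while IO.copy_stream copies between the two IO objects in chunks. A minimal sketch of that approach (`save_stream` is a made-up helper name, and the URL is a placeholder):

```ruby
require 'open-uri'
require 'stringio'

# Copy an IO-like source to a file in chunks, so the payload is never
# held in memory in one piece. IO.copy_stream handles the buffering.
def save_stream(source_io, dest_path)
  File.open(dest_path, 'wb') do |file|
    IO.copy_stream(source_io, file)
  end
end

# Usage with open-uri (hypothetical URL):
# URI.open('https://example.com/api/download') do |remote|
#   save_stream(remote, file_path)
# end
```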
puts 'Updating the Company table...'
streamer = Json::Streamer.parser(file_io: file, chunk_size: 1024) # customize your chunk_size
streamer.get(nesting_level: 1) do |company|
  (do your stuff with the API data here...)
end
end
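For anyone curious about the underlying principle: Zlib::GzipReader can itself decompress in fixed-size chunks, which is what keeps memory bounded regardless of the payload size. A stdlib-only sketch (the method name and chunk size are illustrative choices, not part of Json::Streamer):

```ruby
require 'zlib'

CHUNK_SIZE = 1024

# Yield the decompressed contents of a gzip file in chunks of at most
# chunk_size bytes, so the whole payload never sits in memory at once.
def each_decompressed_chunk(path, chunk_size: CHUNK_SIZE)
  Zlib::GzipReader.open(path) do |gz|
    # gz.read(n) returns up to n bytes, or nil at end of stream.
    while (chunk = gz.read(chunk_size))
      yield chunk
    end
  end
end
```

A streaming JSON parser like Json::Streamer does essentially this internally, feeding each chunk to its parser as it arrives instead of calling `gz.read` with no argument (which slurps everything).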