
Rake task to download and unzip


I would like to update a cities table every week to reflect changes in cities across the world. I am creating a Rake task for the purpose. If possible, I would like to do this without adding another gem dependency.

The file is a publicly available zip archive at http://download.geonames.org/export/dump/cities15000.zip.

My attempt:

require 'net/http'
require 'zip'

namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do

    uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
    zipped_folder = Net::HTTP.get(uri) 

    Zip::File.open(zipped_folder) do |unzipped_folder| #erroring here
      unzipped_folder.each do |file|
        Rails.root.join("", "list_of_cities.txt").write(file)
      end
    end
  end
end

The return from rake geocities:fetch

rake aborted!
ArgumentError: string contains null byte

As detailed, I'm trying to unzip the file and save it to a list_of_cities.txt file. Once I have the methodology down for accomplishing this, I believe I can figure out how to update my db based on the file. (If you have opinions on how best to handle the actual db update, other than my planned way, I'd love to hear them, but that seems like a different post entirely.)


Solution

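  • Zip::File.open expects the path of a zip file on disk, but Net::HTTP.get returns the response body itself, so the raw zip bytes (which contain null bytes) were being treated as a file name. This will save zipped_folder to disk first, then unzip it and save its contents: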

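    require 'net/http'
    require 'zip' # provided by the rubyzip gem

    namespace :geocities do
      desc "Rake task to fetch Geocities city list every 3 days"
      task :fetch do
        uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
        zipped_folder = Net::HTTP.get(uri) # response body: the raw zip bytes

        # Write the bytes to disk in binary mode so Zip::File has a real path to open.
        File.open('cities.zip', 'wb') do |file|
          file.write(zipped_folder)
        end

        # The block form closes the archive when the block returns.
        Zip::File.open('cities.zip') do |zip_file|
          zip_file.each do |file|
            # Raises Zip::DestinationFileExistsError if the target already exists.
            file.extract
          end
        end
      end
    end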
    

    This will extract all files inside the zip file, in this case cities15000.txt.
    You can then read the contents of cities15000.txt and update your database.
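
As a rough illustration of the "read the contents" step, the sketch below parses one line of the extracted file into a hash. It assumes the tab-separated column layout described in the GeoNames readme (geonameid at index 0, name at 1, latitude at 4, longitude at 5, country code at 8, population at 14). `parse_city_line` and the hash keys are hypothetical names, so check them against the readme and your own cities table before relying on them.

```ruby
# Hypothetical helper: turns one tab-separated GeoNames line into a hash.
# Column positions are an assumption based on the GeoNames readme.
def parse_city_line(line)
  cols = line.chomp.split("\t", -1) # -1 keeps trailing empty columns
  {
    geoname_id:   cols[0].to_i,
    name:         cols[1],
    latitude:     cols[4].to_f,
    longitude:    cols[5].to_f,
    country_code: cols[8],
    population:   cols[14].to_i
  }
end

# Usage sketch: stream the file line by line instead of loading it whole.
# File.foreach('cities15000.txt') do |line|
#   attrs = parse_city_line(line)
#   # update or insert the matching row in your own table here
# end
```

Reading with File.foreach keeps memory use flat, which may matter since the file has one line per city.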

    If you want to extract to a different file name, you can pass it to file.extract like this:

    zip_file.each do |file|
      file.extract('list_of_cities.txt')
    end
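
The original error can be reproduced without the zip library, which may help show why writing to disk fixes it. The sketch below assumes the downloaded body begins like a typical zip file (the `PK` signature followed by header fields containing zero bytes). `open_as_path` and `with_zip_on_disk` are hypothetical helper names, and a Tempfile stands in for the fixed `cities.zip` name used above.

```ruby
require 'tempfile'

# Tries to use raw bytes as a file name, as the original task effectively did.
# Returns the error message if Ruby rejects the "path", nil otherwise.
def open_as_path(data)
  File.open(data, 'rb') { |f| f.read }
  nil
rescue ArgumentError => e
  e.message
end

# Hypothetical helper: writes the bytes to a temporary file in binary mode
# and yields its path, which is what a path-based API needs.
# The temporary file is removed when the block returns.
def with_zip_on_disk(bytes)
  Tempfile.create(['cities', '.zip']) do |tmp|
    tmp.binmode
    tmp.write(bytes)
    tmp.flush
    yield tmp.path
  end
end

# Stand-in for the start of a downloaded zip body (assumed, not real data).
fake_body = "PK\x03\x04\x14\x00\x00\x00".b

puts open_as_path(fake_body).inspect
with_zip_on_disk(fake_body) { |path| puts File.size(path) }
```

Inside the `with_zip_on_disk` block, `Zip::File.open(path)` could be called the same way as in the task, with the benefit that no `cities.zip` is left behind between runs.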