I'm working with a Rails application that currently creates class instances before and after a User is created or saved. The problem with this approach is that one class reads from a very large AWS S3 bucket: there are over 7,000 objects I need to add to our database. The whole process takes about 32ms to create the objects and 2911ms to add them to the Batch database.
The reasons I am adding these objects to our database instead of simply reading the AWS bucket are
1) to add properties to these objects
2) to make the objects available to the iPhone application
I'd love to find a way to read this bucket and load it into the database before the Rails application finishes loading, or to run the code in the background.
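To make the "run the code in the background" option concrete, here is a minimal plain-Ruby sketch using a thread. The `import_keys` method and the sample keys are hypothetical stand-ins for the real bucket import, and in a Rails app a job queue may be a better fit than a raw thread, since database connections need care across threads.

```ruby
# Thread-safe queue used to hand the result back to the main thread.
RESULTS = Queue.new

# Hypothetical stand-in for the slow bucket import: keeps only .jpg keys.
def import_keys(keys)
  keys.select { |key| key.end_with?(".jpg") }
end

# Start the import without blocking the caller.
IMPORT_THREAD = Thread.new do
  RESULTS << import_keys(["a/1/top.jpg", "a/1/readme.txt", "a/2/jeans.jpg"])
end

# ... the main thread is free to continue with other work here ...

# Wait for the worker only when the result is actually needed.
IMPORT_THREAD.join
IMPORTED_KEYS = RESULTS.pop
```

The caller only blocks at `join`, so anything between starting the thread and joining it runs while the import is in progress.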
Here is my Batch.rb code:
class Batch < ActiveRecord::Base
  serialize :folder, JSON
  has_many :tops
  has_many :bottoms

  def access_bucket
    AwsAccess.new('curateanalytics', [], "", {}).sort_through_bucket
  end
  class AwsAccess
    def initialize(bucket_name, array, current, obj)
      @bucket_name = bucket_name
      @array = array
      @current = current
      @obj = obj
      # Set for each object by create_new_instances
      @newfolder = nil
      @newbatch = nil
      @newurl = nil
    end

    def access_bucket
      AWS::S3.new.buckets[@bucket_name]
    end

    def sort_through_bucket
      access_bucket.objects.each do |obj|
        if obj_is_swipe_batch?(obj)
          create_new_instances(obj)
          if !obj_contains_key?
            add_newfolder_key
          end
          if !current_equals_batch?
            @current = @newbatch
            if array_not_array?
              @obj[@newfolder] << @array
            end
          end
          # Look the properties up once per object instead of once per check
          properties = Properties.new(@newurl).find_properties
          attributes = {:batch_folder => @newfolder, :batch_number => @newbatch, :file_name => @newurl.split("/").last.gsub("%26","&"), :url => @newurl, :properties => properties}
          if properties[:main_category] == "Bottoms"
            @bottom = Bottoms.create(attributes)
          end
          if properties[:main_category] == "Tops"
            @top = Tops.create(attributes)
          end
        end
      end
    end

    def obj_is_swipe_batch?(obj)
      (obj.key =~ /swipe batches/) && (obj.key =~ /jpg/)
    end

    def create_new_instances(obj)
      parts = obj.key.split("/")
      @newfolder = parts[1]
      @newbatch = parts[-2]
      @newurl = "https://s3.amazonaws.com/curateanalytics/" + obj.key.gsub('&', '%26').gsub('swipe ', 'swipe+')
    end

    def obj_contains_key?
      @obj.key?(@newfolder)
    end

    def add_newfolder_key
      @obj.merge!(@newfolder => [])
    end

    def current_equals_batch?
      @current == @newbatch
    end

    def array_not_array?
      @array != []
    end
  end
  class Properties
    def initialize(bucket_url)
      @bucket_url = bucket_url
      @hash = {}
    end

    def find_properties
      read_json
      parse_json
      parse_main
      parse_sub
      @hash
    end

    def read_json
      @json = JSON.parse(File.read(File.join(Rails.root, 'public', 'DatabaseArray.json')))
    end

    def parse_json
      @json.each do |main|
        @main = main
      end
    end

    def parse_main
      @main.each do |sub|
        @sub = sub
      end
    end

    def parse_sub
      @sub.gsub("\"","")[1..-2].split(",").each do |properties|
        @property = properties.split(":")
        is_everything
      end
    end

    def is_URL?
      @property.first == "URL"
    end

    def is_File_Name?
      @property.first == "File_Name"
    end

    def is_Main?
      @property.second == "{Main_Category"
    end

    def is_everything
      if !is_URL? && !is_File_Name? && is_Main?
        hash_merge(@property.second.gsub!("{",""),@property.last)
      elsif !is_URL? && !is_File_Name?
        hash_merge(@property.first,@property.last)
      end
    end

    def hash_merge(name, property)
      @hash.merge!(name.parameterize.underscore.to_sym => property)
    end
  end
end
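One thing that stands out in the code above is that `read_json` re-reads and re-parses `DatabaseArray.json` on every `find_properties` call, and that call happens several times per bucket object. A minimal sketch of caching the parsed file, assuming the file does not change during the import, could look like this. `CachedJsonFile`, its `reads` counter, and the temp file contents are hypothetical and exist only for illustration.

```ruby
require 'json'
require 'tempfile'

# Hypothetical helper: parses a JSON file the first time it is asked for
# and reuses the parsed result afterwards.
class CachedJsonFile
  attr_reader :reads

  def initialize(path)
    @path = path
    @reads = 0
    @data = nil
  end

  def data
    return @data unless @data.nil?
    @reads += 1                       # counts real disk reads, for the demo
    @data = JSON.parse(File.read(@path))
  end
end

# Demo with a throwaway file standing in for public/DatabaseArray.json.
DEMO_FILE = Tempfile.new(["DatabaseArray", ".json"])
DEMO_FILE.write(JSON.generate([{ "Main_Category" => "Tops" }]))
DEMO_FILE.flush

CACHE = CachedJsonFile.new(DEMO_FILE.path)
3.times { CACHE.data }                # only the first call touches the disk
```

With 7,000+ objects, reading the file once instead of once per lookup removes thousands of disk reads, although the database inserts may still dominate the total time.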
So far I've looked into putting this code in an initializer. I can access the /config/initializer/batch.rb file using binding.pry, and it looks exactly the same as this batch.rb file, but the code never runs.
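One likely explanation, assuming the initializer contains only the class definition shown above: loading a file that defines a class evaluates the definition but does not call any of its methods. Something still has to invoke the method explicitly. The plain-Ruby sketch below illustrates this; the `Importer` class and `CALLS` log are hypothetical.

```ruby
# Records which methods have actually been executed.
CALLS = []

# Defining the class runs the class body, but not the method bodies.
class Importer
  def run
    CALLS << :run
  end
end

# Snapshot taken right after the definition: nothing has been called yet.
CALLS_AFTER_DEFINITION = CALLS.dup

# The work only happens once the method is invoked explicitly.
Importer.new.run
```

By the same reasoning, an initializer would need an explicit call at the bottom of the file before any import happened at boot.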
I solved this problem by creating a custom rake task. I came to this conclusion from this post's second answer. After researching rake tasks and threads, I decided that running a rake task beforehand to populate my databases would work best, since the S3 bucket I'm reading from will not change unless I allow it. For anyone looking into this question, I needed to write
task :task_name => :environment do
  DB0.connection
  DB1.connection
  ...code...
end
and rewrite my code without methods.
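For what it's worth, a rake task body is ordinary Ruby, so it can also call into a class rather than inlining everything. Here is a minimal standalone sketch of that shape. `BatchImporter`, the `batches:import` task name, and the sample keys are hypothetical, and the empty `:environment` task is only a stand-in for the one Rails defines so that the sketch runs outside a Rails app.

```ruby
require 'rake'
extend Rake::DSL   # makes task/namespace/desc available in a plain script

# Collects what the demo importer "imported".
IMPORTED = []

# Hypothetical importer class; in the real app this would wrap the S3 code.
class BatchImporter
  def initialize(keys)
    @keys = keys
  end

  def run
    @keys.each { |key| IMPORTED << key if key.end_with?(".jpg") }
  end
end

# Stand-in for the :environment task that Rails provides.
task :environment do
end

namespace :batches do
  desc "Import swipe batch images"
  task :import => :environment do
    BatchImporter.new(["swipe batches/a/1/x.jpg", "swipe batches/a/1/notes.txt"]).run
  end
end

# In a Rails app this would be run from the command line instead.
Rake::Task["batches:import"].invoke
```

In a Rails project the task definition would normally live in a `.rake` file under `lib/tasks` and be run with `rake batches:import`.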
Although running the rake task takes a short while, user sign-in on the website is now much quicker.