Tags: mysql, ruby-on-rails, xml, parsing, hpricot

Populating XML file data to db in Rails


To work with some client information, I have been parsing a 3 MB XML file with Hpricot, but parsing the file on a regular basis takes quite some time.

I am thinking about loading this data into a MySQL database (once a week) so that I can work with the data directly in MySQL from Rails.

The file is basically a Google Contacts XML file that contains client information: name, email, notes... but some contacts also contain multi-valued fields such as addresses and telephones.

Currently when I am parsing the data I generate a Contact class

class Contact < Struct.new(:name, :email, :telephones, :addresses, :user_address, :notes)
end

telephones and addresses each contain an array with the different values.

I guess that if I want to recreate this structure in the MySQL database I would need to create three tables: contacts, telephones and addresses...

class Contact < ActiveRecord::Base
  has_many :addresses
  has_many :telephones
end

class Telephone < ActiveRecord::Base
  belongs_to :contact
end

class Address < ActiveRecord::Base
  belongs_to :contact
end

How would you populate the database tables with the Contact class data? Is there a way to insert the data directly from the XML file into the database tables?

Any advice and guidance will be greatly appreciated :) Thanks!


Solution

  • First, why not give Nokogiri a try and see if it's faster?

    Rails taught people best practices, and they came to believe that there is a recipe for how to program any given problem. Unfortunately this is not the case; for the usual 90% of tasks there is no magic.

    So if you have a contact with some addresses and some telephones it's just that.

    Here is how I would do it:

    Parse the XML file; if it's too big, stream the parsing.
    For each contact in it, output a hash shaped like the params[:contact] you would usually get in a controller after a form is submitted, and have the Contact model use accepts_nested_attributes_for for addresses and telephones.

    contact = {
      :name => xxx, 
      :user_address => xxx, 
      :notes => xxx,
      :addresses_attributes => [
        {:some_attribute => xxx, :some_other_attribute => xxx}
        ...
      ],
      :telephones_attributes => [
        { :some_attribute => xxx, :some_other_attribute => xxx}
        ...
      ]
    }
    

    Now all that remains is:

    Contact.create(contact)
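
As a rough sketch of the hash-building step, assuming you already have the Struct-based contacts from the question: the plain-Ruby helper below turns one parsed contact into a nested-attributes hash. The names ParsedContact and contact_attributes, and the :number and :street keys, are made up for illustration; swap in the real column names of your telephones and addresses tables.

```ruby
# Struct from the question, renamed here (hypothetical name) so it does not
# clash with the ActiveRecord Contact model.
ParsedContact = Struct.new(:name, :email, :telephones, :addresses, :user_address, :notes)

# Build a hash shaped like params[:contact] for one parsed contact.
# :number and :street are assumed column names, not taken from the question.
def contact_attributes(parsed)
  {
    :name => parsed.name,
    :email => parsed.email,
    :user_address => parsed.user_address,
    :notes => parsed.notes,
    # Array() turns a missing (nil) list into [] so contacts without
    # telephones or addresses still produce a valid hash.
    :telephones_attributes => Array(parsed.telephones).map { |number| { :number => number } },
    :addresses_attributes => Array(parsed.addresses).map { |street| { :street => street } }
  }
end

sample = ParsedContact.new(
  "Jane Doe", "jane@example.com",
  ["555-0100", "555-0101"], ["1 Main St"],
  "owner@example.com", "VIP client"
)
attrs = contact_attributes(sample)

# Inside the Rails app, with accepts_nested_attributes_for declared on
# Contact for :addresses and :telephones, this would become:
#   Contact.create(attrs)
```

For the weekly import you would loop over all parsed contacts and call Contact.create on each resulting hash; wrapping that loop in Contact.transaction is one option if you want the import to be all-or-nothing.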