How to get a class's attributes and map them to .csv headers in a generic way?


I have this function to load several .csv files to an array of objects:

def load_csv_data_to_order_objects
    orders = []
    get_data_paths("../my/path", ".csv").each do |path|
        CSV.foreach(path, :headers => :first_row, :col_sep => ',', encoding: "ISO8859-1:utf-8") do |row|
            orders.push Order.new(
                :date => row["ORDER_DATE"],
                :seller_id => row["SELLER_ID"],
                :order_number => row["ORDER_NUMBER"],
                :product_id => row["PRODUCT_ID"],
                :quantity => row["QUANTITY"].to_i,
                :sales_price => row["SALES_PRICE"].to_f,
            )
        end
    end
    orders
end

This works, but I need to load .csv files with different numbers of columns into different types of objects. The general "shape" of the function is the same, but the object attributes differ.

To minimize code duplication, how can I create a more generic version of this function?

I imagine something like this:

def load_csv_data_to_objects(search_path, file_extension, class_name)
    objects = []
    get_data_paths(search_path, file_extension).each do |path|
        CSV.foreach(path, :headers => :first_row, :col_sep => ',', encoding: "ISO8859-1:utf-8") do |row|
            objects.push class_name.new(

                # How can I get a class's attributes and map them to .csv headers?

            )
        end
    end
    objects
end

Solution

  • The biggest problem I see is that your column headers are not named exactly like your model attributes, for example 'date' vs. 'ORDER_DATE'. There is no trivial way to guess that the attribute date should be populated with the content of the column 'ORDER_DATE', which makes the task considerably more complex.

    My first idea would therefore be to define a mapping on each class, like

    class Order < ActiveRecord::Base
      # Maps each model attribute to the CSV header it is read from.
      CSV_IMPORT_MAPPING = {
        date: 'ORDER_DATE',
        seller_id: 'SELLER_ID',
        # ...
      }
    end
    

    and use this mapping to initialize the objects in a loop:

        CSV.foreach(path, :headers => :first_row, :col_sep => ',', encoding: "ISO8859-1:utf-8") do |row|
            # Build a new attribute hash per row (Ruby 2.4+); the mapping itself stays unchanged.
            objects.push class_name.new(class_name::CSV_IMPORT_MAPPING.transform_values { |csv_attr| row[csv_attr] })
        end
    

    This will basically convert the CSV_IMPORT_MAPPING from

    { date: 'ORDER_DATE', seller_id: 'SELLER_ID' } 
    

    to a hash with the column content as values

    { date: '2016-12-01', seller_id: '12345' }
    

    and pass this hash to the new function to instantiate a new object.
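
To see the conversion in isolation, here is a small sketch that uses a plain hash as a stand-in for one CSV row (the sample values are made up). It builds a new hash with `transform_values` (Ruby 2.4+) instead of modifying the mapping, so the same mapping can be reused for every row.

```ruby
# Mapping from model attribute to CSV header, frozen so it cannot be changed by accident.
CSV_IMPORT_MAPPING = { date: 'ORDER_DATE', seller_id: 'SELLER_ID' }.freeze

# Stand-in for a CSV::Row (assumption for illustration); values read from CSV are strings by default.
row = { 'ORDER_DATE' => '2016-12-01', 'SELLER_ID' => '12345', 'QUANTITY' => '3' }

# Keep each attribute name as the key and look up its header in the row.
attributes = CSV_IMPORT_MAPPING.transform_values { |csv_header| row[csv_header] }

p attributes
```

Columns that are not listed in the mapping ('QUANTITY' here) are simply left out of the result.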

    It is not elegant, but you get a clear overview of which CSV column maps to which model attribute, and you gain flexibility (for example, you could decide to import only the columns listed in the hash rather than all of them).
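
Put together, the mapping approach could look roughly like the sketch below. To keep it runnable on its own, it makes a few assumptions that are not part of the original code: plain Ruby classes stand in for the ActiveRecord models, a hypothetical `Seller` class and `build_objects_from_csv` helper are introduced, and the CSV data is parsed from inline strings rather than read from files.

```ruby
require 'csv'

# Hypothetical plain-Ruby stand-ins for the models; each accepts an attribute
# hash, similar to what ActiveRecord's `new` does.
class Order
  CSV_IMPORT_MAPPING = { date: 'ORDER_DATE', seller_id: 'SELLER_ID' }.freeze
  attr_reader :date, :seller_id

  def initialize(attributes = {})
    attributes.each { |name, value| instance_variable_set("@#{name}", value) }
  end
end

class Seller
  CSV_IMPORT_MAPPING = { id: 'SELLER_ID', name: 'SELLER_NAME' }.freeze
  attr_reader :id, :name

  def initialize(attributes = {})
    attributes.each { |name, value| instance_variable_set("@#{name}", value) }
  end
end

# Builds one object of `klass` per CSV row, using the mapping defined on that class.
def build_objects_from_csv(csv_text, klass)
  CSV.parse(csv_text, headers: :first_row).map do |row|
    klass.new(klass::CSV_IMPORT_MAPPING.transform_values { |header| row[header] })
  end
end

# Made-up sample data; QUANTITY is not in Order's mapping and is therefore skipped.
orders_csv = <<~TEXT
  ORDER_DATE,SELLER_ID,QUANTITY
  2016-12-01,12345,3
  2016-12-02,67890,1
TEXT

sellers_csv = <<~TEXT
  SELLER_ID,SELLER_NAME
  12345,Alice
TEXT

orders  = build_objects_from_csv(orders_csv, Order)
sellers = build_objects_from_csv(sellers_csv, Seller)

puts orders.map(&:date).join(', ')
puts sellers.first.name
```

The same helper serves both classes because everything class-specific lives in the `CSV_IMPORT_MAPPING` constant.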

    Another approach could be to replace 'CSV.foreach' with 'CSV.read' to get access to the headers, iterate over them, and use them to populate your object:

    csv = CSV.read(path, :headers => :first_row, :col_sep => ',', encoding: "ISO8859-1:utf-8")
    csv.each do |row|
      instance = class_name.new
      # Call the setter named after each downcased header, e.g. order_date=
      csv.headers.each do |header|
        instance.send("#{header.downcase}=", row[header])
      end
      objects.push instance
    end

    'header.downcase' will convert 'ORDER_DATE' to 'order_date', so for each column name that differs from your model attributes you will need to add a setter to your model (which I consider really ugly):

    class Order < ActiveRecord::Base
      # Rails' alias_attribute defines order_date, order_date= and order_date? as aliases for date.
      alias_attribute :order_date, :date
    end
    

    This would not be necessary if you could get a CSV file in which the column headers are named exactly like your model attributes.
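
When the headers do match the attribute names apart from their case, the `csv` library's built-in `header_converters: :symbol` option can do the normalization: it downcases each header and turns it into a symbol, so a row converts directly into an attribute hash. A minimal sketch, assuming a hypothetical plain-Ruby `Order` class with an `order_date` attribute and made-up inline data in place of the real files:

```ruby
require 'csv'

# Hypothetical stand-in for the model; attribute names match the downcased headers.
class Order
  attr_accessor :order_date, :seller_id

  def initialize(attributes = {})
    # Assign each value through the setter of the same name.
    attributes.each { |name, value| public_send("#{name}=", value) }
  end
end

csv_text = <<~TEXT
  ORDER_DATE,SELLER_ID
  2016-12-01,12345
TEXT

# :symbol turns the header "ORDER_DATE" into the key :order_date.
table  = CSV.parse(csv_text, headers: :first_row, header_converters: :symbol)
orders = table.map { |row| Order.new(row.to_h) }

puts orders.first.order_date
```

In this sketch a column without a matching setter raises `NoMethodError`, which can serve as an early warning when a file's layout changes.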

    I hope one of these ideas works for you.