Search code examples
ruby-on-railscsvactiverecordbulkinsertactiverecord-import

active record-import is creating duplicate records


I am using active-record import gem in a class method to import array of listings read from csv file as shown in code below:

def self.import_listings(file)  
        start = Time.now    
        count = 0
        valid_listings = Array.new

        CSV.foreach(file.path, headers: true, encoding:'iso-8859-1:utf-8') do |row|

            data = {}

            row.to_hash.each do |k, v|
                key = MAP[k]
                data[key] = v
            end

            unless data[:vin] == nil            
                listing =  Listing.new                      
                listing.title = "#{data[:year]} #{data[:make]} #{data[:model]}"


                listing.approved = true


                unless data[:all_images] == nil

                    listing_images = data[:all_images].split(",")
                    i = 0

                    [:image, :imagefront, :imageback, :imageleft, :imageright, :frontinterior, :rearinterior].each do |image|               
                        unless listing_images.size < 1
                            data[image] = CsvUploading::picture_from_url(listing_images[i])
                            i += 1
                        end

                    end

                end


                data.delete(:all_images)

                data[:approved] = true


                listing.attributes = data


                valid_listings << listing

             end
        end



        begin

            Listing.import valid_listings, on_duplicate_key_update: { conflict_target: [:title, :vin], columns: [user_id: :user_id, newused: :newused]}# , :stocknumber, :model, :year, :trim, :miles, :enginedescription,:cylinder,:fuel,:transmission,  :price, :color, :interiorcolor, :options, :description, :image, :image, :imagefront, :imageback, :imageleft, :imageright, :frontinterior, :rearinterior]

        rescue
            p "some issue"
        end

        finish = Time.now
        puts diff = finish - start                          
    end

As per active-record import documentation, i am trying to set a watcher on the title and VIN fields of listing as conflictied targets.In case of conflict of VIN field of a listing, I want to do an update instead of create.

But right now, it's creating a new listing from Listing.import every time I am running a CSV upload and not checking if its conflicted.

Where am I going wrong?

enter image description here


Solution

  • Your screenshot shows that you are using SQLite, but the documentation says that this feature is only supported for MySQL and PostgreSQL. You will need to use a supported database!


    My first answer, in case it's useful to someone in the future:

    The On Duplicate Key Update behaviour depends on there being a unique constraint on those columns in the database itself – because you're seeing duplicate records, it's safe to assume that you don't have such a constraint.

    As the database isn't reporting a conflict on a unique constraint, there's no reason it would update an existing row! The answer in this Stack Overflow question, "How to correctly do upsert in postgres 9.5" explains this in a bit more depth and shows the kind of query that is being issued.

    You should be able to solve this by adding a unique constraint or index on the VIN and title columns, for example a Rails migration that does something like this:

    class AddUniqueIndexToListingsOnVinAndTitle < ActiveRecord::Migration
      def change
        add_index :listings, [:vin, :title], unique: true
      end
    end
    

    Some more info you might find useful (not sure whether you're using PostgreSQL or MySQL):