So I have written a script that imports data from a CSV file with more than 1.5m rows into a MySQL InnoDB table. It scans the CSV and then, for each CSV row:
Then it goes to the next row in the CSV and repeats the above steps.
The problem is that the table the data is imported into already has 1.5m records, which makes the import very slow. I tried the script and observed that:
Is there any way to make the import process faster?
The root of the problem is this:
You search the table for the key you're about to insert. However, when you perform an insert, the database management system also performs integrity checks: it ensures there are no NULL values in columns marked NOT NULL, checks that foreign key constraints are not violated, and (most importantly in this case) ensures that primary keys and unique fields are unique.
This means that the DBMS will (among other things) look up the key anyway when you perform the INSERT, making your earlier search redundant.
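Presumably the per-row logic currently looks roughly like this (the table, key and column names below are placeholders, not taken from your script):

-- Presumed current pattern: three statements per CSV row
SELECT id FROM tablename WHERE keycolumn = 'key-from-csv';                 -- 1) check whether the row exists
UPDATE tablename SET column1 = 'v1' WHERE keycolumn = 'key-from-csv';      -- 2) run this if it exists...
INSERT INTO tablename (keycolumn, column1) VALUES ('key-from-csv', 'v1');  -- 3) ...or this if it does not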
You can replace the 3 operations with a single one which would look something like:
INSERT INTO tablename (column1, column2, ...)
VALUES (v1, v2, ...)
ON DUPLICATE KEY UPDATE column1 = v1, ...
It is also useful to know that writing column1=VALUES(column1) means "use the value I specified in the VALUES() part of the insert".
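As a sketch of how this could look for one CSV row (the products table and its columns here are made up purely for illustration, assuming a unique key on sku):

-- If the sku does not exist yet, the row is inserted; if it does, the
-- existing row is updated with the values supplied in the VALUES(...) list.
INSERT INTO products (sku, name, price)
VALUES ('ABC-123', 'Blue widget', 9.99)
ON DUPLICATE KEY UPDATE
    name  = VALUES(name),
    price = VALUES(price);

This way each CSV row costs a single round trip to the server instead of a search followed by a separate insert or update.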