python · mysql · bulk-insert · bulk-load

MySql insert speed too slow


I am working on a project that receives information from the GitHub API and analyzes it. The problem is that when I try to insert this kind of huge data (for example, into files I insert a list of lists, as seen below, which has 19k items in it), it takes a very long time. How can I improve that? Thanks.

        for i in commits_array:
            for j in i[-1]:
                self.insert_value_to_files_DB(j)


    def insert_value_to_files_DB(self, array):
        try:
            sql = "INSERT INTO files (file_count,file_sha,file_name,file_status,file_additions,file_deletions,file_changes,commit_sha) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
            self.cursor.execute(sql, array)
            self.connection.commit()
        except mysql.connector.Error as error:
            print("Failed to insert into MySQL table {}".format(error))

Solution

  • 19,000 rows isn't huge for a DBMS. It's small.

    Before you start inserting rows, do

    self.connection.start_transaction()
    

    Then after every hundred rows or so do

    self.connection.commit()
    self.connection.start_transaction()
    

    Then, when you're done, do a final commit().

    On inserts, it's the commit operations that take time. Putting multiple rows into one commit makes things faster, as in the sketch below.
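    A minimal sketch of that batched-commit pattern, assuming the same mysql.connector connection, cursor, and files table from the question; the helper name insert_files_batched and the batch_size parameter are illustrative, not part of the original code:

    def insert_files_batched(self, rows, batch_size=100):
        # Insert rows into files, committing once per batch instead of once per row.
        sql = "INSERT INTO files (file_count,file_sha,file_name,file_status,file_additions,file_deletions,file_changes,commit_sha) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
        self.connection.start_transaction()
        try:
            for n, row in enumerate(rows, start=1):
                self.cursor.execute(sql, row)
                if n % batch_size == 0:
                    # Commit the current batch, then open a new transaction for the next one.
                    self.connection.commit()
                    self.connection.start_transaction()
            # Final commit for any rows left over after the last full batch.
            self.connection.commit()
        except mysql.connector.Error as error:
            self.connection.rollback()
            print("Failed to insert into MySQL table {}".format(error))

    Called with the flattened rows from the question, e.g. self.insert_files_batched(j for i in commits_array for j in i[-1]), this pays the commit cost once per hundred rows instead of once per row.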