Search code examples
pythonsqlinsertpypyodbc

Python SQL Single Commit vs Multiple Commit In With


I'm new to Python so I've been hacking together script from what I find and it's been going well (aka my scripts are working even if they're not the most elegant or correct). My ultimate goal is to take a csv, read it, find a phase, and then if that phrase is found write that row to a SQL database. I have this working but my question is this:

Should I be writing an .execute() then .commit() inside of every loop of the with (which is what I'm doing now)? Or should I build a single statement and execute/commit at the end (unsure how to do this)? I read through this post and I see how to format the insert, but not sure how to take my code (below) and adjust to build a single insert with multiple values. Any help or advice on is the first method just fine or the proper way is the second method would be great! Thanks in advance!

Background: I'm using pypyodbc and a VM. The python script is on the same VM as the SQL Express database.

Here's my code:

with open(CSVFILE, 'rt', encoding='utf8') as f:
    if debug:
        print('Inside the with')
    reader = csv.reader(f, delimiter=',')
    columns = next(reader)
    query = 'insert into TABLENAME ({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    for row in reader:
         for field in row:
             if data.strSearch in field:
                 if debug:
                     print('Found: {}'.format(row[0]))
                 cursor.execute(query, row)
                 cursor.commit()

Solution

  • The way you are doing it looks okay, although you could move the .commit() outside of the loop so you only do one commit at the end. That would probably be a bit faster.

    As for the suggestion to add the rows to a list and then use .executemany, with pypyodbc that approach will result in separate INSERT statements being sent to the server anyway, so it probably won't make much difference. However, pyodbc has a fast_executemany option that would likely provide much better performance compared to your current solution.