Search code examples
pythonmysqlsql-updatemysql-pythonglob

Loop not working for sql update statement (mysqldb)


I have a folder called 'testfolder' that includes two files -- 'Sigurdlogfile' and '2004ADlogfile'. Each file has a list of strings called entries. I need to run my code on both of them and am using glob to do this. My code creates a dictionary for each file and stores data extracted using regex where the dictionary keys are stored in commonterms below. Then it inserts each dictionary into a mysql table. It does all of this successfully, but my second sql statement is not inserting how it should (per file).

import glob
import re
files = glob.glob('/home/user/testfolder/*logfile*')

commonterms = (["freq", "\s?(\d+e?\d*)\s?"],
               ["tx", "#txpattern"],
               ["rx", "#rxpattern"], ...)

terms = [commonterms[i][0] for i in range(len(commonterms))]
patterns = [commonterms[i][1] for i in range(len(commonterms))]

def getTerms(entry):
    for i in range(len(terms)):
        term = re.search(patterns[i], entry)
        if term:
            term = term.groups()[0] if term.groups()[0] is not None else term.groups()[1]
        else:
            term = 'NULL'
        d[terms[i]] += [term]
    return d

for filename in files:
    #code to create 'entries'
    objkey = re.match(r'/home/user/testfolder/(.+?)logfile', filename).group(1)

    d = {t: [] for t in terms}

    for entry in entries:
        d = getTerms(entry)

    import MySQLdb
    db = MySQLdb.connect(host='', user='', passwd='', db='')
    cursor = db.cursor()
    cols = d.keys()
    vals = d.values()

    for i in range(len(entries)):
        lst = [item[i] for item in vals]
        csv = "'{}'".format("','".join(lst))
        sql1 = "INSERT INTO table (%s) VALUES (%s);" % (','.join(cols), csv.replace("'NULL'", "NULL"))
        cursor.execute(sql1)

#now in my 2nd sql statement I need to update the table with data from an old table, which is where I have the problem...

    sql2 = "UPDATE table, oldtable SET table.key1 = oldtable.key1, 
table.key2 = oldtable.key2 WHERE oldtable.obj = %s;" % repr(objkey)
    cursor.execute(sql2)

    db.commit()
    db.close()

The problem is that in the second sql statement, it ends up inserting that data into all columns of the table from only one of the objkeys, but I need it to insert different data depending on which file the code is currently running on. I can't figure out why this is, since I've defined objkey inside my for filename in files loop. How can I fix this?


Solution

  • Instead of doing separate INSERT and UPDATE, do them together to incorporate the fields from the old table.

    for i in range(len(entries)):
        lst = [item[i] for item in vals]
        csv = "'{}'".format("','".join(lst))
        sql1 = """INSERT INTO table (key1, key2, %s) 
                SELECT o.key1, o.key2, a.*
                FROM (SELECT %s) AS a
                LEFT JOIN oldtable AS o ON o.obj = %s""" % (','.join(cols), csv.replace("'NULL'", "NULL"), repr(objkey))
        cursor.execute(sql1)