Add parameters to a short hand for loop in Python


I am sorry that I could not come up with a better headline for my problem. I am very new to Python programming and I have to make a small change to the existing code in an application. The current Python code reads every row and every column of an Excel sheet and stores the values unchanged in a DB table named "Commits". The DB we are using is SQLite, and the Python library we are using for it is sqlite3. In the existing code only 6 columns are inserted into the DB as they are from the Excel sheet. The code is as follows:

def constructjenkinsdata(filepath):
    return [(jenkinsentry[0], jenkinsentry[1], jenkinsentry[3],jenkinsentry[2],jenkinsentry[4],jenkinsentry[5],str(dateparser.parse(jenkinsentry[6],ignoretz=True)),jenkinsentry[7]) for jenkinsentry in csvrowgenerator(filepath)]

def loadjenkinsdata(jenkinssource, concurrency=3):
    p = Pool(concurrency)
    jenkinsdataset_lists = p.map(constructjenkinsdata, jenkinssource)
    jenkinsdataset = list(chain.from_iterable(jenkinsdataset_lists))
    persistjenkinsdata(jenkinsdataset, batchid)

In the above code persistjenkinsdata is a function which simply passes an insert query to the DB to insert the data set.

Following is the code of csvrowgenerator:

import csv

def csvrowgenerator(filepath):
    # Yield the rows of the CSV file one at a time
    with open(filepath, encoding="utf8") as f:
        for row in csv.reader(f):
            yield row

Now my requirement is that, while reading each row of the Excel sheet, the second column has to be passed as input to a SELECT JOIN query. The query returns a set of two columns, and both values need to be added to the tuple built by the shorthand for loop above before it is inserted into the DB.

I have written a function that fetches the respective values from the DB, and it returns the correct output when I pass a static input. But I don't know how to pass the inputs dynamically from the above for loop to that function. How do I change the line below so that two more values are included in the data set? What I need is as follows:

return [(jenkinsentry[0], jenkinsentry[1], jenkinsentry[3],jenkinsentry[2],jenkinsentry[4],jenkinsentry[5],str(dateparser.parse(jenkinsentry[6],ignoretz=True)),jenkinsentry[7], {Value-1 getting returned from DB, Input is jenkinsentry[2]} , {Value-2 returned from DB, input is jenkinsentry[2]}) for jenkinsentry in csvrowgenerator(filepath)]

I am very new to Python, so I don't know how to change this shorthand for loop to build the collection accordingly. I tried the following:

def constructjenkinsdata(filepath):
    columns = []
    for jenkinsentry in csvrowgenerator(filepath):
      buildNo = jenkinsentry[0]
      #print('Build NO .. ' + buildNo)
      columns.append(buildNo)
      url = jenkinsentry[1]
      columns.append(url)
      #print('URL....'+url)
      testCaseName = jenkinsentry[2]
      columns.append(testCaseName)
      className = jenkinsentry[3]
      columns.append(className)
      crid = jenkinsentry[4]
      columns.append(crid)
      errorDetails = jenkinsentry[5]
      columns.append(errorDetails)
      createTime = str(dateparser.parse(jenkinsentry[6],ignoretz=True))
      columns.append(createTime)
      print('Create time.....' + str(createTime))
      status = jenkinsentry[7]
      columns.append(status)
      print('Test Case Name....' + testCaseName)
      lastFailedBuildIDAndCreateTime = returnbuildIDAndCreatedateSet(testCaseName)
      lastFailedBuildID = str(lastFailedBuildIDAndCreateTime[0])
      print('Last Failed build ID .....' + str(lastFailedBuildID))
      columns.append(lastFailedBuildID)
      lastFailedCreateTimeString = str(lastFailedBuildIDAndCreateTime[1])
      columns.append(lastFailedCreateTimeString)
      print('Last Failed Create Time .....' + str(lastFailedCreateTimeString))
      #jenkinsColumnCollection.append(columns)
      return columns

But the above code only runs once and does not iterate through all the rows. When I try using yield instead of return, it gives an error in the following line: jenkinsdataset = list(chain.from_iterable(jenkinsdataset_lists))

Below is the output of the print statement for the data set returned by my for loop code:

jenkins data set.. First Item.... [['922', 'sdsadad', 'Test Suite Hook s', 'Test Suite Hooks', '', '', '2018-05-23 01:00:00', 'pass', 'None', 'None']]

Following is how the output is getting formed with the actual code of "return [(jenkinsentry[0], jenkinsentry[1], jenkinsentry[3],jenkinsentry[2],jenkinsentry[4],jenkinsentry[5],str(dateparser.parse(jenkinsentry[6],ignoretz=True)),jenkinsentry[7]) for jenkinsentry in csvrowgenerator(filepath)]"

jenkins data set.. First Item.... [[('922', 'sdsadad', 'Test Suite Hooks', 'Test Suite Hooks', '', '', '2018-05-23 01:00:00', 'pass', '', ''), ('922', 'sdsadad', 'abc/ssdsd', 'Quebjhfghjhdg aeuyruyiyd', '', '', '2018-05-23 01:00:00', 'pass', '', ''), ('922', 'sdsadad', 'abc/ssdsd', 'Quebjhfghjhdg aeuyruyiyd', '', '', '2018-05-23 01:00:00', 'pass', '', '')]]

Please help me form a collection that takes its input from the existing for loop and is parsed correctly by list(chain.from_iterable(jenkinsdataset_lists))


Solution

  • Let's have a look at your code

    def constructjenkinsdata(filepath):
        columns = []
        for jenkinsentry in csvrowgenerator(filepath):
          ...
          return columns
    

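    The return statement sits inside the for loop, so the function returns columns at the end of the first iteration. Dedent it so that it runs only after the loop has finished: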

    def constructjenkinsdata(filepath):
        columns = []
        for jenkinsentry in csvrowgenerator(filepath):
           ...
        return columns
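
Note that moving the return only fixes the early exit: columns would still be one flat list of values, whereas the original comprehension produced one tuple per row. A minimal sketch of the per-row version follows. It assumes your lookup returns a (build ID, create time) pair; lookup_last_failure is a hypothetical stand-in for your returnbuildIDAndCreatedateSet, the sample rows are made up, and the dateparser.parse conversion is left out to keep the sketch dependency-free.

```python
import csv
from itertools import chain


def lookup_last_failure(test_case_name):
    """Hypothetical stand-in for returnbuildIDAndCreatedateSet.

    Returns (last_failed_build_id, last_failed_create_time), or
    (None, None) when nothing is known about the test case.
    """
    known = {"Test Suite Hooks": (917, "2018-05-20 01:00:00")}
    return known.get(test_case_name, (None, None))


def build_row(entry):
    """Turn one CSV row into the tuple that will be inserted."""
    build_id, create_time = lookup_last_failure(entry[2])
    # Same column order as the original comprehension, plus the two
    # looked-up values. In the real code entry[6] would still go through
    # str(dateparser.parse(entry[6], ignoretz=True)).
    return (entry[0], entry[1], entry[3], entry[2], entry[4], entry[5],
            entry[6], entry[7], build_id, create_time)


def construct_rows(lines):
    # One tuple per CSV row, collected into a list.
    return [build_row(entry) for entry in csv.reader(lines)]


# Made-up sample data standing in for the contents of one CSV file.
sample = [
    "922,sdsadad,Test Suite Hooks,HooksClass,,,2018-05-23 01:00:00,pass",
    "922,sdsadad,abc/ssdsd,OtherClass,,,2018-05-23 01:00:00,pass",
]

# Two "files" processed separately, then flattened as in loadjenkinsdata.
rows_per_file = [construct_rows(sample), construct_rows(sample)]
dataset = list(chain.from_iterable(rows_per_file))
```

Because each call still returns a list of tuples, list(chain.from_iterable(...)) flattens the per-file lists just as it did with the original comprehension. In your code, construct_rows would iterate over csvrowgenerator(filepath) instead of an in-memory list.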