python for-loop export-to-csv

Python Writer Skips first row


I'm new to Python and don't understand why my code behaves this way.

I have a csv file from which I read some data. I compare the data with another csv file and if I find similarities I want to copy some data from the second file. However here's the problem:

            with open('WeselVorlageRE5Woche.csv') as woche:
                with open('weselfund.csv','+a',newline='') as fund:

                    readCSV1 = csv.reader(woche, delimiter=';')
                    for row1 in readCSV1:   
                        if row[1]==row1[4]: #find starting time
                            if row[3]==row1[1]: # find same train
                                if row[2]=='cancelled': # condition for taking row
                                    zug=row1[6]     #copy trainnumber
                                    print(zug)
                                    for row2 in readCSV1:
                                        if row2[6]==zug: #find all trainnumbers
                                            #write data to csv
                                            writer = csv.writer(fund, delimiter=';')
                                            writer.writerow(row2)

In my second for loop it looks as if the first row is skipped: every time the loop starts, the first matching row of data isn't written to the new file. Can someone tell me why the first one is always missing? If I add a dummy row to the dataset I read from, I get exactly what I want written, but I don't want to add dummy rows everywhere.


Solution

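  • A csv reader is an iterator, so it gets 'used up' as you iterate over it. Your inner loop runs over the same reader as the outer loop, so it continues from the row after the one the outer loop has just read. The row that matched (row1) has already been 'used' and the inner loop never sees it. We can show the 'used up' behaviour by making a simple reader over a list of terms: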

    >>> import csv
    >>> test = ["foo", "bar", "baz"]
    >>> reader = csv.reader(test)
    >>> for row in reader:
    ...     print(row)
    ... 
    ['foo']
    ['bar']
    ['baz']
    >>> for row in reader:
    ...     print(row)
    ... 
    >>> 
    
    

    The second time it prints nothing because the iterator has already been exhausted. If your dataset is not too large you can solve this by storing the rows in a list, and thus in memory, instead:

    data = [row for row in readCSV1]
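    To see exactly which rows the inner loop misses, here is a minimal sketch that runs the same nested loop once over a shared reader and once over a list. The sample rows, the two-column layout and the helper name `inner_rows_seen` are made up for illustration and are not taken from the original dataset.

```python
import csv
import io

# Hypothetical sample data: train number in column 0, status in column 1.
SAMPLE = "100;cancelled\n100;ok\n200;ok\n"

def inner_rows_seen(rows):
    """Run a nested loop over `rows` and return the rows the inner loop matches."""
    seen = []
    for outer in rows:
        if outer[1] == "cancelled":
            for inner in rows:
                if inner[0] == outer[0]:
                    seen.append(inner)
    return seen

# Shared reader: the inner loop resumes after the row the outer loop consumed,
# so the 'cancelled' row itself is never visited by the inner loop.
shared = inner_rows_seen(csv.reader(io.StringIO(SAMPLE), delimiter=";"))

# List: every `for` statement starts again from the first element.
stored = inner_rows_seen(list(csv.reader(io.StringIO(SAMPLE), delimiter=";")))

print(shared)  # [['100', 'ok']]
print(stored)  # [['100', 'cancelled'], ['100', 'ok']]
```

    With the shared reader the outer loop also stops after the first match, because the inner loop has consumed the rest of the file.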
    

    If the document is too big you will need to make a second file reader and feed it to a second csv reader.
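    The second-reader idea can be sketched as a small generator that opens a fresh file handle on every call, so each pass starts at the first row without holding the whole file in memory. The helper name `iter_rows`, the temporary demo file and its two-column contents are assumptions for this example only.

```python
import csv
import os
import tempfile

def iter_rows(path, delimiter=";"):
    """Yield rows from `path`, opening a new file handle on every call."""
    with open(path, newline="") as handle:
        yield from csv.reader(handle, delimiter=delimiter)

# Hypothetical demo file: train number in column 0, status in column 1.
with tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False) as tmp:
    tmp.write("100;cancelled\n100;ok\n200;ok\n")
    demo_path = tmp.name

matches = []
for outer in iter_rows(demo_path):
    if outer[1] == "cancelled":
        # A new reader per pass, so the inner scan starts at the first row.
        for inner in iter_rows(demo_path):
            if inner[0] == outer[0]:
                matches.append(inner)

os.remove(demo_path)  # clean up the demo file
print(matches)  # [['100', 'cancelled'], ['100', 'ok']]
```

    The trade-off is that the file is re-read from disk for every match, which costs time instead of memory.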

    The final code becomes:

    import csv

    # `row` comes from the enclosing loop over the first csv file (not shown)
    with open('WeselVorlageRE5Woche.csv', newline='') as woche:
        with open('weselfund.csv', 'a', newline='') as fund:
            # read all rows into a list so they can be iterated more than once
            readCSV1 = list(csv.reader(woche, delimiter=';'))
            writer = csv.writer(fund, delimiter=';')  # create the writer once
            for row1 in readCSV1:
                if row[1]==row1[4]: #find starting time
                    if row[3]==row1[1]: # find same train
                        if row[2]=='cancelled': # condition for taking row
                            zug=row1[6]     #copy trainnumber
                            print(zug)
                            for row2 in readCSV1:
                                if row2[6]==zug: #find all trainnumbers
                                    #write data to csv
                                    writer.writerow(row2)
    

    That is the version that stores the rows in memory. If you want to use a second reader instead, it becomes:

    import csv

    # `row` comes from the enclosing loop over the first csv file (not shown)
    with open('WeselVorlageRE5Woche.csv', newline='') as woche:
        with open('weselfund.csv', 'a', newline='') as fund:
            # plain reader here: nothing is stored in memory
            readCSV1 = csv.reader(woche, delimiter=';')
            writer = csv.writer(fund, delimiter=';')  # create the writer once
            for row1 in readCSV1:
                if row[1]==row1[4]: #find starting time
                    if row[3]==row1[1]: # find same train
                        if row[2]=='cancelled': # condition for taking row
                            zug=row1[6]     #copy trainnumber
                            print(zug)
                            # second file handle and reader, starting at the first row
                            with open('WeselVorlageRE5Woche.csv', newline='') as woche2:
                                readCSV2 = csv.reader(woche2, delimiter=';')
                                for row2 in readCSV2:
                                    if row2[6]==zug: #find all trainnumbers
                                        #write data to csv
                                        writer.writerow(row2)
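    As a possible alternative to rescanning the rows for every match, the work can be split into two passes: first collect the wanted train numbers, then write every row that carries one of them. This is a sketch under assumptions, not the code from the question: the function name `copy_trains`, the `wanted` pair (standing in for row[1] and row[3] of the outer loop that is not shown) and the sample rows are all hypothetical, and the check on row[2]=='cancelled' is left to the caller. Unlike the nested loops, it writes each row at most once.

```python
import csv
import io

def copy_trains(week_rows, wanted, out_file):
    """Copy all rows of the trains that match `wanted` into `out_file`.

    `week_rows` must be a list (it is iterated twice); `wanted` is a
    hypothetical (start_time, train) pair.
    """
    start_time, train = wanted
    # Pass 1: collect the train numbers (column 6) of the matching rows.
    numbers = {r[6] for r in week_rows if r[4] == start_time and r[1] == train}
    # Pass 2: write every row carrying one of those numbers, each row once.
    writer = csv.writer(out_file, delimiter=";")
    for r in week_rows:
        if r[6] in numbers:
            writer.writerow(r)
    return numbers

# Made-up rows with seven columns; only columns 1, 4 and 6 matter here.
week_rows = [
    ["a", "RE5", "c", "d", "08:00", "f", "4711"],
    ["a", "RE5", "c", "d", "09:00", "f", "4711"],
    ["a", "RE5", "c", "d", "08:00", "f", "4712"],
    ["a", "RB1", "c", "d", "08:30", "f", "4800"],
]
out = io.StringIO()  # stands in for the opened output file
numbers = copy_trains(week_rows, ("08:00", "RE5"), out)
print(out.getvalue())
```

    In the real script `out` would be the opened 'weselfund.csv' file and `week_rows` the list built from the csv reader.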