python csv csv-header

How to skip the headers when processing a csv file using Python?


I am using the code below to edit a CSV file with Python. The functions it calls are defined in the upper part of the script.

Problem: I want the code below to start editing the CSV from the 2nd row and leave the 1st row, which contains the headers, untouched. Right now the functions are also applied to the 1st row, so my header row is getting changed.

in_file = open("tmob_notcleaned.csv", "rb")
reader = csv.reader(in_file)
out_file = open("tmob_cleaned.csv", "wb")
writer = csv.writer(out_file)
row = 1
for row in reader:
    row[13] = handle_color(row[10])[1].replace(" - ","").strip()
    row[10] = handle_color(row[10])[0].replace("-","").replace("(","").replace(")","").strip()
    row[14] = handle_gb(row[10])[1].replace("-","").replace(" ","").replace("GB","").strip()
    row[10] = handle_gb(row[10])[0].strip()
    row[9] = handle_oem(row[10])[1].replace("Blackberry","RIM").replace("TMobile","T-Mobile").strip()
    row[15] = handle_addon(row[10])[1].strip()
    row[10] = handle_addon(row[10])[0].replace(" by","").replace("FREE","").strip()
    writer.writerow(row)
in_file.close()    
out_file.close()

I tried to solve this by initializing the row variable to 1, but it didn't work.

Please help me in solving this issue.


Solution

  • Your reader variable is an iterator; by looping over it you retrieve the rows one at a time.

    To make it skip one item before your loop, simply call next(reader, None) and ignore the return value.

    You can also simplify your code a little; use the opened files as context managers to have them closed automatically:

    import csv

    # "rb"/"wb" are the Python 2 modes; on Python 3 open with "r"/"w" and newline="" instead
    with open("tmob_notcleaned.csv", "rb") as infile, open("tmob_cleaned.csv", "wb") as outfile:
        reader = csv.reader(infile)
        next(reader, None)  # skip the headers
        writer = csv.writer(outfile)
        for row in reader:
            # process each row
            writer.writerow(row)
    
    # no need to close, the files are closed automatically when you get to this point.
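
To see what next(reader, None) does on its own, here is a small in-memory sketch. The helper name rows_after_header and the sample data are made up for illustration; io.StringIO stands in for a real file so the snippet runs anywhere on Python 3.

```python
import csv
import io


def rows_after_header(text):
    """Return (header, data_rows) for CSV text; header is None if the text is empty."""
    reader = csv.reader(io.StringIO(text))
    # next() consumes the first row; the default None avoids StopIteration on empty input
    header = next(reader, None)
    # the reader now continues from the second row
    return header, list(reader)


# Hypothetical sample data, not taken from the question's file
sample = "name,color\nphone a,black\nphone b,white\n"
header, rows = rows_after_header(sample)
empty_header, empty_rows = rows_after_header("")

print(header, rows)
print(empty_header, empty_rows)
```

Because the reader keeps its position, anything that iterates over it afterwards only sees the data rows.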
    

    If you want to write the header to the output file unprocessed, that's easy too: pass the return value of next() to writer.writerow():

    headers = next(reader, None)  # returns the headers or `None` if the input is empty
    if headers:
        writer.writerow(headers)
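
Putting the pieces together on Python 3 might look like the sketch below. It assumes text-mode files opened with newline="", which is what the Python 3 csv module expects. The names clean_csv and strip_cells, the temporary file names and the sample rows are all hypothetical; strip_cells is only a stand-in for the handle_* cleanup in the question.

```python
import csv
import os
import tempfile


def clean_csv(in_path, out_path, process_row):
    """Copy in_path to out_path, writing the header unchanged and
    applying process_row to every other row."""
    # Python 3: text mode with newline="" lets the csv module handle line endings
    with open(in_path, newline="") as infile, open(out_path, "w", newline="") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        headers = next(reader, None)  # None if the input file is empty
        if headers:
            writer.writerow(headers)
        for row in reader:
            writer.writerow(process_row(row))


def strip_cells(row):
    # Hypothetical stand-in for the real per-row cleanup
    return [cell.strip() for cell in row]


# Build a tiny throwaway input file so the sketch is self-contained
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.csv")
dst = os.path.join(workdir, "out.csv")
with open(src, "w", newline="") as f:
    f.write(" Model ,Color\r\n phone a , black \r\n")

clean_csv(src, dst, strip_cells)

with open(dst, newline="") as f:
    result = list(csv.reader(f))
print(result)
```

The header row keeps its original spacing because it never passes through process_row; only the data rows are cleaned.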