python csv data-manipulation data-munging

"Doing work" on csv DictReader fails

I am writing a script where I need to read a CSV into a DictReader, do some work on the fields (data munging), then output the DictReader to a csv via DictWriter.

If I read the CSV then write the Dict, the process works.

#Create the sample file
headers = ['Symbol', 'Price', 'Date', 'Time', 'Change', 'Volume']
rows = [{'Symbol':'AA', 'Price':39.48, 'Date':'6/11/2007',
          'Time':'9:36am', 'Change':-0.18, 'Volume':181800},
        {'Symbol':'AIG', 'Price': 71.38, 'Date':'6/11/2007',
          'Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
        {'Symbol':'AXP', 'Price': 62.58, 'Date':'6/11/2007',
          'Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
        ]

#Open sample file
with open('stocks.csv','w') as f:
    f_csv = csv.DictWriter(f, headers)
    f_csv.writeheader()
    f_csv.writerows(rows)

#Output the dict    
with open('stocks.csv', 'r') as file:
    csvread = csv.DictReader(file, delimiter=',')

    with open('out.csv', 'w') as out:
        headertowrite = ['Time', 'Symbol', 'NewColumn']
        writer = csv.DictWriter(out, headertowrite, extrasaction='ignore')
        writer.writeheader()
        writer.writerows(csvread)
#Works!

However - if I add a new column, it appears that I lose all the data in the DictReader:

headers = ['Symbol', 'Price', 'Date', 'Time', 'Change', 'Volume']
rows = [{'Symbol':'AA', 'Price':39.48, 'Date':'6/11/2007',
          'Time':'9:36am', 'Change':-0.18, 'Volume':181800},
        {'Symbol':'AIG', 'Price': 71.38, 'Date':'6/11/2007',
          'Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
        {'Symbol':'AXP', 'Price': 62.58, 'Date':'6/11/2007',
          'Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
        ]

with open('stocks.csv','w') as f:
    f_csv = csv.DictWriter(f, headers)
    f_csv.writeheader()
    f_csv.writerows(rows)

with open('stocks.csv', 'r') as file:
    csvread = csv.DictReader(file, delimiter=',')


    for row in csvread:
        row['NewColumn'] = '1'

    with open('out.csv', 'w') as out:
        headertowrite = ['Time', 'Symbol', 'NewColumn']
        writer = csv.DictWriter(out, headertowrite, extrasaction='ignore')
        writer.writeheader()
        writer.writerows(csvread)
#Out.csv is blank!

Is there a way I can perform work on the DictReader before I write it?

Solution

As csvread has been entirely consumed by your first for loop, our with statement will output a blank file, as you may notice.

You should either:

--> save rows into a variable

import csv
headers = ['Symbol', 'Price', 'Date', 'Time', 'Change', 'Volume']
rows = [{'Symbol':'AA', 'Price':39.48, 'Date':'6/11/2007',
         'Time':'9:36am', 'Change':-0.18, 'Volume':181800},
        {'Symbol':'AIG', 'Price': 71.38, 'Date':'6/11/2007',
         'Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
        {'Symbol':'AXP', 'Price': 62.58, 'Date':'6/11/2007',
         'Time':'9:36am', 'Change':-0.46, 'Volume': 935000}]

with open('stocks.csv','w') as f:
    f_csv = csv.DictWriter(f, headers)
    f_csv.writeheader()
    f_csv.writerows(rows)

with open('stocks.csv', 'r') as file:
    csvread = csv.DictReader(file, delimiter=',')

    rows = []
    for row in csvread:
        row['NewColumn'] = '1'
        rows.append(row)

    with open('out.csv', 'w') as out:
        headertowrite = ['Time', 'Symbol', 'NewColumn']
        writer = csv.DictWriter(out, headertowrite, extrasaction='ignore')
        writer.writeheader()
        writer.writerows(rows)

--> make your modification within the with statement that will output the file (you read one line, modify it and write it.

import csv
headers = ['Symbol', 'Price', 'Date', 'Time', 'Change', 'Volume']
rows = [{'Symbol':'AA', 'Price':39.48, 'Date':'6/11/2007',
         'Time':'9:36am', 'Change':-0.18, 'Volume':181800},
        {'Symbol':'AIG', 'Price': 71.38, 'Date':'6/11/2007',
         'Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
        {'Symbol':'AXP', 'Price': 62.58, 'Date':'6/11/2007',
         'Time':'9:36am', 'Change':-0.46, 'Volume': 935000}]

with open('stocks.csv','w') as f:
    f_csv = csv.DictWriter(f, headers)
    f_csv.writeheader()
    f_csv.writerows(rows)

with open('stocks.csv', 'r') as file:
    csvread = csv.DictReader(file, delimiter=',')

    with open('out.csv', 'w') as out:
        headertowrite = ['Time', 'Symbol', 'NewColumn']
        writer = csv.DictWriter(out, headertowrite, extrasaction='ignore')
        writer.writeheader()
        for row in csvread:
            row['NewColumn'] = '1'
            writer.writerow(row)

Take care! It is writerow not writerows in the second solution!

Note that I also highly recommend the second solution which is more scalable, in terms of memory consumption.