i hope my question makes sense. i am looking for a way to read a csv file and apply a different dictionary to each column. i can make it work with hard-coded strings, but i am having a hard time making it work when the values come from a csv file.
note:
string0 would be cell A1 or row[0]
string1 would be cell B1 or row[1]
string2 would be cell C1 or row[2]
this is what i have so far:
dict0 = {'A':'CODE1', 'B':'CODE2'}
text0 = []
string0 = 'A'
dict1 = {'avenue':'ave', 'street':'st', 'road':'rd', 'court':'ct'}
text1 = []
string1 = '123 MAIN AVENUE'
dict2 = {'(':'', ')':'', '-':'', ' ':'', '/':'', '\\':''}
text2 = []
string2 = '(123) 456/7890'
for i in string0:
    newcode = dict0.get(i,i)
    text0.append(newcode)
print ' '.join(text0)
for i in string1.lower().split(' '):
    newaddress = dict1.get(i.lower(),i)
    text1.append(newaddress)
print ' '.join(text1)
for i in string2:
    newphone = dict2.get(i,i)
    text2.append(newphone)
print ''.join(text2)
the code above works exactly as i intend it to, but i'm having a hard time making it work when reading a csv file.
thank you very much
edit #1:***********************************************
here is an excerpt of sample1.csv:
A,123 MAIN STREET,(123) 456-7890
B,888 TEST ROAD,(222) 555-5555
sorry if the code isn't cleaner/clearer, but that is why i am asking for guidance.
in essence, every column will have a dictionary associated with it, so that the "code" column will write "CODE1 or CODE2" depending on the value of cell A1 ("A" or "B").
column 2 will have dict1{} associated with it, and will clean up the address column. column 3 will have dict2{} associated with it, and will remove (,),/,\ from the phone number column.
my issue is i do not know how to start the code. i can write the code if i set the cell information as variables (see my code above, variables: string0, string1, string2), but i do not know how to start iterating over the csv file.
thank you
edit #2:***********************************************
here is my code when i try using import csv
dict0 = {'A':'CODE1', 'B':'CODE2'}
text0 = []
dict1 = {'avenue':'ave', 'street':'st', 'road':'rd', 'court':'ct'}
text1 = []
dict2 = {'(':'', ')':'', '-':'', ' ':'', '/':'', '\\':''}
text2 = []
import csv
with open('O:/sample1.csv', 'rb') as c:
    reader = csv.reader(c)
    for row in reader:
        for i in row[0]:
            newcode = dict0.get(i,i)
            text0.append(newcode)
        for i in row[1].lower().split(' '):
            newaddress = dict1.get(i.lower(),i)
            text1.append(newaddress)
        for i in row[2]:
            newphone = dict2.get(i,i)
            text2.append(newphone)
        print str(' '.join(text0)) + ',' + str(' '.join(text1)) + ',' + str(''.join(text2))
prints:
CODE1,123 main st,1234567890
CODE1 CODE2,123 main st 888 test rd,12345678902225555555
i would like to print:
CODE1,123 main st,1234567890
CODE2,888 test rd,2225555555
hopefully someone can help
thank you
edit #3 *********************************************************************************************************************
can the following be improved (syntax, indentation etc..):
sample1.csv:
A,123 MAIN STREET,(123) 456-7890
B,888 TEST ROAD,(222) 555-5555
here is the code:
import csv
newcsv = csv.writer(open('O:/csvfile1.csv', 'ab'))
with open('O:/sample1.csv', 'rb') as c:
    reader = csv.reader(c)
    dict0 = {'A':'CODE1', 'B':'CODE2'}
    dict1 = {'avenue':'ave', 'street':'st', 'road':'rd', 'court':'ct'}
    dict2 = {'(':'', ')':'', '-':'', ' ':'', '/':'', '\\':''}
    # read element in *reader*
    for row in reader:
        text0 = []
        text1 = []
        text2 = []
        newline = []
        # read element in *row*
        for i in row[0]:
            newcode = dict0.get(i,i)
            text0.append(newcode)
        newline.append(' '.join(text0))
        for i in row[1].lower().split(' '):
            newaddress = dict1.get(i.lower(),i)
            text1.append(newaddress)
        newline.append(' '.join(text1))
        for i in row[2]:
            newphone = dict2.get(i,i)
            text2.append(newphone)
        newline.append(''.join(text2))
        newcsv.writerow(newline)
        print newline
prints the following:
['CODE1', '123 main st', '1234567890']
['CODE2', '888 test rd', '2225555555']
creates csvfile1.csv (using '|' as a 'cell delimiter') and it's exactly what i want:
CODE1|123 main st|1234567890
CODE2|888 test rd|2225555555
just wondering if the above code can be improved or written in a more efficient way.
thank you
The reason for the garbled output is that you are not clearing the text<n> variables on each iteration of the loop, so every row's results are appended to those of the rows before it. The fix is below, but I recommend at least reading up on how to define functions and rewriting the code without so many global variables, so that you don't run into the same problem again.
with open('O:/sample1.csv', 'rb') as c:
    reader = csv.reader(c)
    for row in reader:
        # start with empty lists for every row
        text0 = []
        text1 = []
        text2 = []
        for i in row[0]:
            newcode = dict0.get(i,i)
            text0.append(newcode)
        for i in row[1].lower().split(' '):
            newaddress = dict1.get(i.lower(),i)
            text1.append(newaddress)
        for i in row[2]:
            newphone = dict2.get(i,i)
            text2.append(newphone)
        print str(' '.join(text0)) + ',' + str(' '.join(text1)) + ',' + str(''.join(text2))
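To illustrate the advice about functions, here is one possible sketch of the same logic with a small function per column. It assumes Python 3 (the code above is Python 2), and the names `clean_code`, `clean_address`, `clean_phone` and `clean_row` are made up for this example. It reads the two sample rows from an in-memory string instead of `O:/sample1.csv` so that it runs anywhere. Because each function builds its result from scratch, nothing is left over from the previous row.

```python
import csv
import io

# Lookup tables copied from the question.
CODES = {'A': 'CODE1', 'B': 'CODE2'}
STREET_ABBREVIATIONS = {'avenue': 'ave', 'street': 'st', 'road': 'rd', 'court': 'ct'}
PHONE_JUNK = {'(': '', ')': '', '-': '', ' ': '', '/': '', '\\': ''}


def clean_code(value):
    """Map each character through CODES; unknown characters are kept."""
    return ' '.join(CODES.get(ch, ch) for ch in value)


def clean_address(value):
    """Lowercase the address and abbreviate words found in STREET_ABBREVIATIONS."""
    words = value.lower().split(' ')
    return ' '.join(STREET_ABBREVIATIONS.get(word, word) for word in words)


def clean_phone(value):
    """Drop every character listed in PHONE_JUNK."""
    return ''.join(PHONE_JUNK.get(ch, ch) for ch in value)


def clean_row(row):
    """Apply one cleaner per column; each call returns a fresh list."""
    return [clean_code(row[0]), clean_address(row[1]), clean_phone(row[2])]


# In-memory stand-in for sample1.csv (hypothetical; swap in a real file object).
sample = io.StringIO(
    "A,123 MAIN STREET,(123) 456-7890\n"
    "B,888 TEST ROAD,(222) 555-5555\n"
)
cleaned = [clean_row(row) for row in csv.reader(sample)]
for row in cleaned:
    print(','.join(row))
```

With the two sample rows this prints `CODE1,123 main st,1234567890` and `CODE2,888 test rd,2225555555`. To read a real file in Python 3, open it in text mode with `newline=''` rather than `'rb'` and pass the file object to `csv.reader` in place of `sample`.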