Search code examples
pythoncsvdictionarydbfarcpy

Updating a csv with python


I am very new to python and am working on a project to automate a mapping process for several thousand shape files. My eventual goal is to get each row in my csv into a separate .dbf dbase table. The information in the last column of my csv is not properly formatted. I need to take the data that looks like this: 'R12D9L12U9' (these are distance and direction calls) and have it converted to something like this: 'N 90-0-0 E, 12, S 0-0-0 E, 9, N 90-0-0 W, 12, N 0-0-0 E, 9', in order for the .dbf files to be mapped.The distances can range from 1 to about 200. I have a script that uses a dictionary to change the letters to their respective direction calls. My current issue is that when I try to add the commas after the distances (the numbers) using the dictionary, the operation cannot tell the difference between single, double, and triple digit numbers. For example, instead of getting '12,' I will get '1,2,' and in some cases '1,,2,'. I need a smarter way to read through the csv, possible using a list, but the csv module help files from python only mention using a dictionary these types of operations.

Here is the portion of my code that deals with this operation:

#Importing OpenPyXl and loads the workbook and sheet.
import openpyxl, csv, xlrd, glob, pyexcel, arcpy, datetime, time
wb = openpyxl.load_workbook('TESTVECT.xlsx')
ws = wb.get_sheet_by_name('TEST')


print "{o,o}" + '\tStart @ ' + datetime.datetime.fromtimestamp(time.time()).strftime('%H:%M:%S')
print "/)__)"
print ' " " '

#allows to save more than once.
write_only = False

#Sets a variable for all rows in a sheet.
maxRow = ws.max_row + 1

#xlsx to csv
wb1 = xlrd.open_workbook('TESTVECT.xlsx')
sh1 = wb1.sheet_by_name('TEST')
your_csv_file = open('TESTVECT.csv', 'wb')
wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
for rownum in xrange(sh1.nrows):
    wr.writerow(sh1.row_values(rownum))
your_csv_file.close()

#Creates a dictionary for the direction conversion.
directionUpdates = {'1':'1,','2':'2,','3':'3,','4':'4,','5':'5,','6':'6,','7':'7,','8':'8,','9':'9,','10':'10,','11':'11,','12':'12,','13':'13,','14':'14,','15':'15,','16':'16,','17':'17,','18':'18,','19':'19,','20':'20,','21':'21,','22':'22,','23':'23,','24':'24,','25':'25,',
                    '26':'26,','27':'27,','28':'28,','29':'29,','30':'30,','31':'31,','32':'32,','33':'33,','34':'34,','35':'35,','36':'36,','37':'37,','38':'38,','39':'39,','40':'40,','41':'41,','42':'42,','43':'43,','44':'44,','45':'45,','46':'46,','47':'47,','48':'48,','49':'49,','50':'50,',
                    '51':'51,','52':'52,','53':'53,','54':'54,','55':'55,','56':'56,','57':'57,','58':'58,','59':'59,','60':'60,','61':'61,','62':'62,','63':'63,','64':'64,','65':'65,','66':'66,','67':'67,','68':'68,','69':'69,','70':'70,','71':'71,','72':'72,','73':'73,','74':'74,','75':'75,','76':'76,','77':'77,','78':'78,','79':'79,','80':'80,','81':'81,','82':'82,','83':'83,','84':'84,','85':'85,','86':'86,','87':'87,','88':'88,','89':'89,','90':'90,','91':'91,','92':'92,','93':'93,','94':'94,','95':'95,','96':'96,','97':'97,','98':'98,','99':'99,','100':'100,',
                    '101':'101,','102':'102,','103':'103,','104':'104,','105':'105,','106':'106,','107':'107,','108':'108,','109':'109,','110':'110,','111':'111,','112':'112,','113':'113,','114':'114,','115':'115,','116':'116,','117':'117,','118':'118,','119':'119,','120':'120,','121':'121,','122':'122,','123':'123,','124':'124,','125':'125,','126':'126,','127':'127,','128':'128,','129':'129,','130':'130,','131':'131,','132':'132,','133':'133,','134':'134,','135':'135,','136':'136,','137':'137,','138':'138,','139':'139,','140':'140,','141':'141,','142':'142,','143':'143,','144':'144,','145':'145,','146':'146,','147':'147,','148':'148,','149':'149,','150':'150,'}

#Defines the replacement operation as the 'replace_all' function that can be used later.
def replace_all(text, dic):
    for i, j in directionUpdates.items():
        text = text.replace(i, j)
    return text
#This is the starting .csv file that will be corrected for directions
with open('TESTVECT.csv','r') as f:
    text=f.read()
    text=replace_all(text,directionUpdates)
with open('TESTVECT.csv','w') as w:
    w.write(text)

directionUpdates = {'L': 'N 90-0-0 W,','R':'N 90-0-0 E,','U':'N 0-0-0 E,','D':'S 0-0-0 E,'}


def replace_all_2(text, dic):
    for i, j in directionUpdates.items():
        text = text.replace(i, j)
    return text
#This is the starting .csv file that will be corrected for directions
with open('TESTVECT.csv','r') as f:
    text=f.read()
    text=replace_all_2(text,directionUpdates)
with open('TESTVECT.csv','w') as w:
    w.write(text)

I know my dictionary looks very sloppy and there has to be a better way to make this operation possible.


Solution

  • Not entirely sure if this was your question. However here is how you can add coma after each digits. However this separates each digits from letters then joins them back together again appending comma in-between each items.

    import re
    
    inputD =  'R12D9L12U9'
    numbersSplit = re.split('(\d+)',inputD)
    addComa = ','.join(numbersSplit)
    print addComa
    

    Output result: R,12,D,9,L,12,U,9