I have read a lot of SO topics about encoding to get an idea of what's wrong with my code, but I'm still stuck.
I want to decode Google polylines that I have in a CSV file. I am using the polyline library, which works fine. The problem is that some polylines can't be processed when I read them from the CSV file, but when I pass the same polyline as a string literal it works. I presume it's an encoding issue, because the polylines that cause problems contain two consecutive backslashes and/or backticks.
import csv
import polyline

INPUT_FILE = 'sample_input.csv'

csv.register_dialect(
    'mydialect',
    delimiter=',',
    quotechar='"',
    doublequote=True,
    skipinitialspace=True,
    quoting=csv.QUOTE_ALL)

with open(INPUT_FILE, 'r', encoding="utf-8") as csv_file:
    read = csv.reader(csv_file, dialect='mydialect')
    header = next(read, [])
    for row in read:
        site_id = row[0]
        encoded_polyline = row[1]
        print(site_id)
        try:
            decoded = polyline.decode(encoded_polyline)
            print(decoded)
        except:
            print(encoded_polyline)
        print()
A sample polyline is:
"dk`mEg}jx[STEFGJKRONUVSTkAtAiAlAsA~Ag@p@[^[`@e@p@KTSVU\\GHGNEHEHCFAFAFAFAPAP?N?B@T@V@R@F"
Please note that here in the post it may also appear with only one backslash and no backticks. Is that a similar encoding issue?
Any help would be appreciated, especially an explanation of why the behaviour with a string literal differs from the behaviour with a string variable.
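This should solve your problem. In the CSV file each backslash is stored as two literal backslash characters, whereas in a Python string literal two backslashes are an escape sequence for a single backslash. So the value read from the file is not the same string as the literal, and you need to collapse the doubled backslashes before decoding: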
decoded = polyline.decode(encoded_polyline.replace('\\\\','\\'))
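To see the difference without the polyline library, here is a minimal sketch using only the standard library. The one-row CSV text and the shortened polyline fragment are made up for illustration, and it assumes the file really stores two backslash characters, as the question describes.

```python
import csv
import io

# Hypothetical CSV content: the raw string keeps BOTH backslash characters,
# just as they would be stored on disk in the file from the question.
csv_text = r'''"site_id","polyline"
"1","KTSVU\\GHGN"
'''

# In Python source code, "\\" in a normal string literal is ONE backslash.
literal = "KTSVU\\GHGN"

reader = csv.reader(io.StringIO(csv_text))
next(reader)                  # skip the header row
from_file = next(reader)[1]   # csv does not unescape backslashes by default

# The value read from the file is one character longer than the literal.
print(len(literal), len(from_file))

# Collapse each pair of backslashes into a single backslash.
fixed = from_file.replace('\\\\', '\\')
print(fixed == literal)
```

The replacement works pair by pair from left to right, so if every backslash in the file was doubled, a polyline that legitimately contains two backslashes in a row (four in the file) still comes out correctly as two.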