Tags: python, python-3.x, csv, encoding, google-polyline

Polylines from csv - different behaviour than from string literal


I have read a lot of SO topics about encoding to work out what's wrong with my code, but I'm still stuck.

I want to decode Google polylines that I have in a CSV file. I am using the polyline library, which works fine. The problem is that some polylines can't be decoded when I read them from the CSV file, but when I pass the same polyline as a string literal it works. I presume it's an encoding issue, because the polylines that cause problems contain two consecutive backslashes and/or backticks.

import csv
import polyline

INPUT_FILE = 'sample_input.csv'

csv.register_dialect(
    'mydialect',
    delimiter = ',',
    quotechar = '"',
    doublequote = True,
    skipinitialspace = True,
    quoting = csv.QUOTE_ALL)

with open(INPUT_FILE, 'r', encoding="utf-8") as csv_file:

    read = csv.reader(csv_file, dialect='mydialect')

    header = next(read, [])

    for row in read:

        site_id = row[0]
        encoded_polyline = row[1]

        print(site_id)

        try:
            decoded = polyline.decode(encoded_polyline)
            print(decoded)
        except Exception:
            # Decoding failed: print the raw value so the problem row is visible.
            print(encoded_polyline)

        print()

A sample polyline that fails is:

"dk`mEg}jx[STEFGJKRONUVSTkAtAiAlAsA~Ag@p@[^[`@e@p@KTSVU\\GHGNEHEHCFAFAFAFAPAP?N?B@T@V@R@F"

Please note that, as rendered here, it may also appear with only one backslash and no backtick, probably because of a similar escaping issue.

Any help would be appreciated, especially an explanation of why the behaviour with a string literal differs from the behaviour with a string variable.


Solution

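  • This should solve your problem. In a Python string literal, the two-character sequence \\ is an escape for a single backslash, whereas the csv reader returns both backslashes exactly as they are stored in the file. Collapsing the doubled backslashes before decoding makes the value read from the file match the literal: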

    decoded = polyline.decode(encoded_polyline.replace('\\\\','\\'))
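
The difference between the two cases can be checked without the polyline library. The following is a minimal sketch, assuming the CSV file really stores doubled backslashes as the question describes; the one-row file content and the short fragment "SVU\\GHG" are made up for illustration, and io.StringIO stands in for the real file.

```python
import csv
import io

# Hypothetical file content. Raw strings are used so that the text really
# contains TWO backslash characters, as the problem rows in the CSV do.
file_content = r'"site_id","polyline"' + "\n" + r'"1","SVU\\GHG"' + "\n"

rows = list(csv.reader(io.StringIO(file_content)))
from_csv = rows[1][1]

# Ordinary (non-raw) literal: the escape sequence \\ yields ONE backslash.
from_literal = "SVU\\GHG"

print(len(from_csv), len(from_literal))  # 8 7
print(from_csv == from_literal)          # False

# The fix from the answer: collapse each doubled backslash into one.
fixed = from_csv.replace('\\\\', '\\')
print(fixed == from_literal)             # True
```

Printing repr(encoded_polyline) for a failing row is a quick way to confirm whether the value read from the file contains the extra backslash.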