python parsing python-2.7 data-structures string-parsing

Parsing a string into a list of dicts


I have a string that looks like this:

POLYGON ((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))

I can easily strip POLYGON out of the string to focus on the numbers, but I'm wondering what would be the easiest/best way to parse this string into a list of dicts.

The first parenthesis (right after POLYGON) indicates that multiple elements can be provided, separated by commas.

Each pair of numbers is supposed to be an x and a y.

I'd like to parse this string to end up with the following data structure (using Python 2.7):

list [ //list of polygons
  list [ //polygon n°1
    dict { //polygon n°1's first point
      'x': 148210.445767647, //first number
      'y': 172418.761192525 //second number
    },
    dict { //polygon n°1's second point
      'x': 148183.930888667,
      'y': 172366.054787545
    },
    ... // rest of polygon n°1's points
  ], //end of polygon n°1
  list [ // polygon n°2
    dict { // polygon n°2's first point
      'x': 148221.97916844,
      'y': 172344.568316375
    },
    ... // rest of polygon n°2's points
  ] // end of polygon n°2
] // end of list of polygons

A polygon can have any number of points.
The two numbers of each point are separated by a space.

Do you guys know a way to do this with a loop or recursively?

PS: I'm kind of a Python beginner (only a few months under my belt), so don't hesitate to explain in detail. Thank you!


Solution

  • Can you try this?

    import ast
    
    POLYGON = '((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))'
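    # Turn each "x y" pair into "(x,y)" so the whole string becomes a nested tuple literal
    new_polygon = '(' + POLYGON.replace(', ', '),(').replace(' ', ',') + ')'
    
    # literal_eval only evaluates Python literals, here nested tuples of floats
    data = ast.literal_eval(new_polygon)
    result_list = []
    for ring in data:
        sub_list = []
        for point in ring:
            sub_list.append({
                'x': point[0],
                'y': point[1]
            })
        result_list.append(sub_list)
    
    print(result_list)  # the parentheses work in both Python 2.7 and 3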
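
  • If you would rather not rewrite the string into a tuple literal, here is a minimal sketch that splits it directly. The function name parse_polygon is made up for illustration. It assumes every point is exactly two numbers separated by whitespace and that rings never contain nested parentheses. It leaves the POLYGON prefix in place, and it should also cope with a string that contains only one ring.

```python
import re


def parse_polygon(wkt):
    """Parse 'POLYGON ((x y, x y, ...), (x y, ...))' into a list of lists of dicts."""
    # An innermost "( ... )" group has no parentheses inside it: that is one ring.
    rings = re.findall(r'\(([^()]+)\)', wkt)
    polygons = []
    for ring in rings:
        points = []
        for pair in ring.split(','):
            # split() with no argument splits on any run of whitespace
            x, y = pair.split()
            points.append({'x': float(x), 'y': float(y)})
        polygons.append(points)
    return polygons
```

    Calling parse_polygon on your original string (prefix included) should give one inner list per parenthesised group, each holding one {'x': ..., 'y': ...} dict per point. A pair that does not contain exactly two numbers raises a ValueError instead of being skipped silently.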