Tags: python, pandas, numpy, split, latitude-longitude

How to split a column in Python into multiple columns using different delimiters


I want to split a column in a dataframe into multiple columns.

This is the dataframe (df) I have:

object_id    shape            geometry
   1         450    polygon((6.6 51.2, 6.69 51.23, 6.69 51.2))

The output I want looks like this:

x      y    x    y      x    y
6.6  51.2  6.69  51.23  6.69  51.2

I am using this code:

df.geometry.str.split('( , )',expand=True)

but I'm getting an error.


Solution

  • This is a bit hacky, but some string manipulation can reshape the geometry column into the layout you want.

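    # Strip the "polygon((" prefix and "))" suffix, split into points, then into values
    s = df['geometry'].str.replace(r'polygon\(\(|\)\)', '', regex=True)\
          .str.split(',', expand=True).stack()\
          .str.strip().str.split(' ').explode().to_frame('vals')

    # Label the two values of each point as x and y
    s['cords'] = s.groupby(level=[0, 1]).cumcount().map({0: 'x', 1: 'y'})

    # Pivot back to one row per original record and join onto df
    df.join(s.set_index('cords', append=True).unstack([1, 2]).droplevel(level=[0, 1], axis=1))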
    

       object_id  shape                                    geometry    x     y     x      y     x     y
    0          1    450  polygon((6.6 51.2, 6.69 51.23, 6.69 51.2))  6.6  51.2  6.69  51.23  6.69  51.2
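
  • A shorter alternative, offered as a sketch and not as the answer's own method: pull every number out of the string with a regular expression and label the resulting columns by position. It assumes plain decimal coordinates (no exponents), the same number of points in every row, and values that always come in x, y order. The sample frame is rebuilt from the question, and the names numbers, coords and result are only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "object_id": [1],
    "shape": [450],
    "geometry": ["polygon((6.6 51.2, 6.69 51.23, 6.69 51.2))"],
})

# Collect every number in each geometry string, in order of appearance.
# Assumption: plain decimal notation, optionally negative.
numbers = df["geometry"].str.findall(r"-?\d+(?:\.\d+)?")

# One column per number, converted from str to float.
coords = pd.DataFrame(numbers.tolist(), index=df.index).astype(float)

# Even positions are x values, odd positions are y values.
coords.columns = ["x" if i % 2 == 0 else "y" for i in range(coords.shape[1])]

# Attach the coordinate columns to the original frame.
result = df.join(coords)
print(result)
```

    Unlike the stack/unstack version, this returns the coordinates as floats instead of strings. The repeated x and y column names match the requested output but make selection by label awkward, so names such as x1, y1, x2, y2 may be easier to work with.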