python pandas dataframe data-processing

Can't split the values of one column in a Python dataframe into two


I have the following dataset.

df=
0    153.38 -27.99
1    151.21 -33.87
2    151.21 -33.87
3    153.05 -26.68
4    153.44 -28.06
Name: merchant_long_lat, dtype: object

When I run the following code to split the long-lat values into two separate columns, one for longitude and one for latitude:

df[['merch_long','merch_lat']]=df.str.split(expand=True)

It raised the following error:

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_1821133/1587686441.py in <module>
----> 1 df[['merch_long','merch_lat']]=df.str.split(expand=True)

~/.local/lib/python3.8/site-packages/pandas/core/indexers.py in check_key_length(columns, key, value)
    426     if columns.is_unique:
    427         if len(value.columns) != len(key):
--> 428             raise ValueError("Columns must be same length as key")
    429     else:
    430         # Missing keys in columns are represented as -1

ValueError: Columns must be same length as key

Solution

  • You are trying to do this:

    df[['merch_long','merch_lat']]=df.merchant_long_lat.str.split(" -",expand=True)
    

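    You have to specify which column you are splitting and which separator to split on. If you don't specify a separator, the default is to split on whitespace. Note that splitting on " -" also consumes the minus sign, so merch_lat ends up holding 27.99 rather than -27.99; split on " " if you need to keep the sign.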

    For more information about pd.Series.str.split(), see the official documentation, which describes the pat parameter as:

    String or regular expression to split on. If not specified, split on whitespace.
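
A minimal sketch of the whole operation, assuming the data lives in a DataFrame column named merchant_long_lat (the question only shows the values, so the DataFrame wrapper here is an assumption). The traceback shows that the error is raised when the number of columns produced by the split differs from the number of column names on the left-hand side, so the sketch passes n=1 to cap the result at two pieces. It also converts the pieces to floats, and shows that splitting on " -" drops the sign.

```python
import pandas as pd

# Sample values copied from the question; wrapping them in a DataFrame
# with this column name is an assumption made for illustration.
df = pd.DataFrame({
    "merchant_long_lat": [
        "153.38 -27.99",
        "151.21 -33.87",
        "151.21 -33.87",
        "153.05 -26.68",
        "153.44 -28.06",
    ]
})

# Split on whitespace (the default when no separator is given).
# n=1 allows at most one split, so the result has at most two columns.
parts = df["merchant_long_lat"].str.split(n=1, expand=True)

# Convert the string pieces to floats; the minus sign is preserved.
df[["merch_long", "merch_lat"]] = parts.astype(float)

# For comparison: splitting on " -" consumes the minus sign.
sign_lost = df["merchant_long_lat"].str.split(" -", expand=True)

print(df[["merch_long", "merch_lat"]])
print(sign_lost[1].iloc[0])
```

If some rows contain more than two tokens, the second piece will still hold the leftover text and the float conversion will fail, which at least makes the malformed rows visible instead of silently producing extra columns.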