python pandas data-cleaning

Equal spacing in all the strings of a column in Python


I have more than 1,400,000 rows of data in an Address column, for example:

3, clive row, calcutta
3 , clive row,calcutta
3,clive row , calcutta

The spacing around the commas is uneven, and I want every row in the same format, as below:

3, clive row, calcutta

How can this be done?


Solution

  • One way is to replace each comma, together with any whitespace before or after it, with ", ":

    df.Address = df.Address.str.replace(r"\s*,\s*", ", ", regex=True)
    

    Another way is to split on each comma, together with any whitespace before or after it, and then join the parts with ", ":

    df.Address = df.Address.str.split(r"\s*,\s*").str.join(", ")
    

    Either approach gives:

    >>> df.Address
    
    0    3, clive row, calcutta
    1    3, clive row, calcutta
    2    3, clive row, calcutta
    Name: Address, dtype: object
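
For anyone who wants to try both approaches end to end, here is a minimal, self-contained sketch. It assumes pandas is installed and builds a three-row DataFrame from the sample rows in the question; the variable names `replaced` and `split_joined` are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample rows copied from the question; the real column has over a million rows.
df = pd.DataFrame({
    "Address": [
        "3, clive row, calcutta",
        "3 , clive row,calcutta",
        "3,clive row , calcutta",
    ]
})

# Approach 1: replace each comma plus any surrounding whitespace with ", ".
replaced = df["Address"].str.replace(r"\s*,\s*", ", ", regex=True)

# Approach 2: split on the same pattern, then join the parts with ", ".
# str.split treats a multi-character pattern as a regular expression by default.
split_joined = df["Address"].str.split(r"\s*,\s*").str.join(", ")

# Both approaches should yield the same normalized strings.
assert replaced.tolist() == split_joined.tolist()

# Write the normalized values back to the column.
df["Address"] = replaced
print(df["Address"].tolist())
```

Note that the pattern only touches whitespace next to a comma, so leading or trailing spaces on the whole string are left alone; chaining `.str.strip()` would be one way to handle those if they occur in the data.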