Tags: python, python-3.x, pandas, split

Moving Names into "First Name Last Name" Order


I have imported a CSV dataset into Python and am cleaning it up. The names are inconsistent: some are "John Doe" and others are "Doe, John". I need them all to be "First name Last name", without the comma:

Doe, John  
Smith, John 
Snow, John
John Cena 
Steve Smith 

What I want:

 John Doe 
 John Smith 
 John Snow
 John Cena 
 Steve Smith

I tried doing:

if ',' in df['names']:
    df['names'] = ' '. join(df['names'].split(',')[::-1]).strip()

I get

AttributeError: 'Series' object has no attribute 'split'

I also tried converting the column to a list before running the code above, but that didn't work:

df['name'] = df['name'].to_list()

Solution

  • You can use str.replace with capture groups to swap the two parts of the name:

    df['names'] = df['names'].str.replace(r'([^,]+),\s*(.+)', r'\2 \1', regex=True)
    print(df)
    
    # Output
             names
    0     John Doe
    1   John Smith
    2    John Snow
    3    John Cena
    4  Steve Smith
    

    Note: you have to use the str accessor in your code (this removes the AttributeError, but it does not solve the next problem):

    # Replace
    df['names'].split(',')
    
    # With
    df['names'].str.split(',')
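
For completeness, here is a minimal, self-contained sketch that runs the regex replacement on the sample names from the question and, alongside it, one possible way to make the original split-and-reverse idea work with the str accessor. It assumes the column is called names and that each "Last, First" entry contains exactly one comma; the variable names (has_comma, parts, swapped_regex, swapped_split) are illustrative only and not part of the original answer.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    "names": ["Doe, John", "Smith, John", "Snow, John", "John Cena", "Steve Smith"]
})

# Approach 1: regex with capture groups (same pattern as the answer above).
# Rows without a comma do not match and are left unchanged.
swapped_regex = df["names"].str.replace(r"([^,]+),\s*(.+)", r"\2 \1", regex=True)

# Approach 2: the split-and-swap idea, applied element-wise.
# A plain `if ',' in df['names']` does not test each value, so build a
# boolean mask that marks the rows containing a comma instead.
has_comma = df["names"].str.contains(",", regex=False)

# Split only the marked rows into two columns: 0 = last name, 1 = first name.
parts = df.loc[has_comma, "names"].str.split(",", n=1, expand=True)

# Write "First Last" back into the marked rows; .loc aligns on the index.
swapped_split = df["names"].copy()
swapped_split.loc[has_comma] = parts[1].str.strip() + " " + parts[0].str.strip()

# Trim any stray leading or trailing whitespace in the final column.
df["names"] = swapped_split.str.strip()
print(df)
```

On this sample data both approaches should produce the same five names. The regex version is shorter; the mask version may be easier to extend if the comma-form rows need extra handling.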