python pandas dataframe substring nan

How do I get variable length slices of values using Pandas?


I have data with a full name and a first name in each row, and I need to add a new column with the last name. I can assume the last name is the full name with the first name removed (full - first = last).

I've been trying to use slice with a start index equal to the length of the first name + 1. But that index is a Series, not an integer, so the result is NaN.

The commented lines show the things I tried. It took me a while to realize what the series/integer issue was. It seems this shouldn't be so difficult.

Thanks

import pandas as pd

columns = ['Full', 'First']
data = [('Joe Smith', 'Joe'), ('Bobby Sue Ford', 'Bobby Sue'), ('Current Resident', 'Current Resident'), ('', '')]
df = pd.DataFrame(data, columns=columns)

#first_chars = df['First'].str.len() + 1

#last = df['Full'].str[4:]
#last = df['Full'].str[first_chars:]
#last = df['Full'].str.slice(first_chars)
#last = df.Full.str[first_chars:]
#pd.DataFrame.insert(df, loc=2, column='Last', value=last)

#df['Last'] = df.Full.str[first_chars:]
#df['Last'] = str(df.Full.str[first_chars:])

#first_chars = int(first_chars)
#df['Last'] = df['Full'].apply(str).apply(lambda x: x[first_chars:])
# Current attempt: the start index is a Series rather than an int, so 'Last' comes back as NaN
df['Last'] = df['Full'].str.slice(df['First'].str.len() + 1)

print(df)

Solution

  • Edit: Use str.removeprefix (Python 3.9+) instead of replace to handle cases where the first and last names are the same (row 4 in the output below was added to show this):

    df['Last'] = df.apply(lambda row: row['Full'].removeprefix(row['First']).strip(), axis=1)
    
                   Full             First   Last
    0         Joe Smith               Joe  Smith
    1    Bobby Sue Ford         Bobby Sue   Ford
    2  Current Resident  Current Resident       
    3                                           
    4           Joe Joe               Joe    Joe
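
If str.removeprefix is not available (it was added in Python 3.9), the same idea can be written with a plain per-row slice, which is close to what the question was attempting. This is only a sketch: drop_first_name is a hypothetical helper name, and the 'Joe Joe' row is the extra row from the output above rather than part of the question's data.

```python
import pandas as pd

data = [
    ('Joe Smith', 'Joe'),
    ('Bobby Sue Ford', 'Bobby Sue'),
    ('Current Resident', 'Current Resident'),
    ('', ''),
    ('Joe Joe', 'Joe'),  # extra row: first and last names are the same
]
df = pd.DataFrame(data, columns=['Full', 'First'])


def drop_first_name(full, first):
    # Hypothetical helper: slice off the first name only when it is a real prefix.
    if full.startswith(first):
        full = full[len(first):]
    # Remove the space that separated the first name from the rest.
    return full.strip()


# zip pairs each full name with its own first name, so every row gets
# its own integer slice length instead of a Series.
df['Last'] = [drop_first_name(full, first)
              for full, first in zip(df['Full'], df['First'])]

print(df)
```

The list comprehension avoids apply with axis=1, which some people find easier to read; whether it is faster on a given dataset would need to be measured.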
    

    Original answer: Use apply with axis=1 to remove the first name from each full name:

    df['Last'] = df.apply(lambda row: row['Full'].replace(row['First'], '').strip(), axis=1)
    
                   Full             First   Last
    0         Joe Smith               Joe  Smith
    1    Bobby Sue Ford         Bobby Sue   Ford
    2  Current Resident  Current Resident       
    3
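
To see where replace and removeprefix diverge, here is a small plain-Python comparison (no pandas needed, since each lambda above operates on ordinary strings). The names are made-up examples, not data from the question, and removeprefix assumes Python 3.9 or later.

```python
# Made-up (full, first) pairs chosen to show where the two approaches differ.
names = [
    ('Joe Smith', 'Joe'),    # both approaches agree
    ('Joe Joe', 'Joe'),      # first and last names are the same
    ('Rob Roberts', 'Rob'),  # first name also appears inside the last name
]

rows = []
for full, first in names:
    # replace removes every occurrence of the first name, wherever it appears.
    via_replace = full.replace(first, '').strip()
    # removeprefix removes it only once, and only at the start of the string.
    via_prefix = full.removeprefix(first).strip()
    rows.append((full, via_replace, via_prefix))
    print(f"{full!r}: replace -> {via_replace!r}, removeprefix -> {via_prefix!r}")
```

For 'Joe Joe', replace leaves an empty string, and for 'Rob Roberts' it leaves 'erts', while removeprefix keeps the last name intact in both cases.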