python string pandas substring string-length

In Pandas, how do I check if three combined string columns == 10 characters, and if so, insert into new column?


I want to combine three Pandas string columns into a new column, but only if the combined string is exactly 10 characters long.

If it is not 10 characters, the next set of combined columns should be checked instead.

I've tried adding the columns together into a new column only where Phone1Area has 3 characters, Phone1Prefix has 3 characters, and Phone1NumberPart has 4 characters (10 characters in total). I've also tried df.loc and many other approaches.

Here is a sample of the data set:

[Image: Dataset]

Here is the code:

dfp['p1'] = df[(df['Phone1Area'].str.len() == 3.0)]['Phone1Area'] + 
df[(df['Phone1Exchange'].str.len() == 3.0)]['Phone1Exchange'] + 
df[(df['Phone1NumberPart'].str.len() == 4.0)]['Phone1NumberPart']


dfp['p2'] = df[(df['Phone2Area'].str.len() == 3.0)]['Phone2Area'] + 
df[(df['Phone2Exchange'].str.len() == 3.0)]['Phone2Exchange'] + 
df[(df['Phone2NumberPart'].str.len() == 4.0)]['Phone2NumberPart']

df_phone.loc[df_phone['p1'].str.len() == 10, 'phone'] = df_phone['p1']
df_phone.loc[df_phone['p2'].str.len() == 10, 'phone'] = df_phone['p2']

Here is what I want it to do, but in Pandas:

if df_phone['p1'].str.len() == 10:
    then insert df_phone['p1'] into df_phone['phone']
elif df_phone['p2'].str.len() == 10:
    then insert df_phone['p2'] into df_phone['phone']
elif df_phone['p3'].str.len() == 10:
    then insert df_phone['p3'] into df_phone['phone']

I expected the phone column to have the 10 characters from phone 1, and if that wasn't 10 characters, then the phone column to have the 10 characters from phone 2, etc.

But one of the results was:

AttributeError: 'DataFrame' object has no attribute 'str'

Any idea how to fix this?


Solution

  • This should help:

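    # Column names are case-sensitive; these match the names used in the question.
    df['phone'] = ''
    # Phone 1: keep it only where the combined string is exactly 10 characters.
    df['test_phone'] = df['Phone1Area'] + df['Phone1Exchange'] + df['Phone1NumberPart']
    df.loc[df['test_phone'].str.len() == 10, 'phone'] = df['test_phone']
    # Phone 2: only fill rows that are still empty.
    df['test_phone'] = df['Phone2Area'] + df['Phone2Exchange'] + df['Phone2NumberPart']
    df.loc[(df['test_phone'].str.len() == 10) & (df['phone'] == ''), 'phone'] = df['test_phone']
    # Phone 3: same pattern. Using .loc avoids chained-assignment problems.
    df['test_phone'] = df['Phone3Area'] + df['Phone3Exchange'] + df['Phone3NumberPart']
    df.loc[(df['test_phone'].str.len() == 10) & (df['phone'] == ''), 'phone'] = df['test_phone']
    # ...and so on for any remaining phone columns.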
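
The if/elif logic from the question can probably also be written in one pass with numpy.select, which picks, for each row, the first choice whose condition is true. This is only a sketch under assumptions: the sample rows, the helper name first_valid_phone, and the fillna("") handling of missing parts are made up for illustration and are not taken from the original data set.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Phone1Area": ["212", "21", None],
    "Phone1Exchange": ["555", "555", "555"],
    "Phone1NumberPart": ["0100", "0101", "0102"],
    "Phone2Area": ["646", "646", "64"],
    "Phone2Exchange": ["555", "555", "555"],
    "Phone2NumberPart": ["0200", "0201", "0202"],
})


def first_valid_phone(frame, prefixes=("Phone1", "Phone2"), length=10):
    """Per row, return the first combined number with exactly `length` characters."""
    combined = [
        # Assumption: a missing part is treated as an empty string,
        # so the combined value simply fails the length check.
        frame[p + "Area"].fillna("")
        + frame[p + "Exchange"].fillna("")
        + frame[p + "NumberPart"].fillna("")
        for p in prefixes
    ]
    conditions = [(c.str.len() == length).to_numpy(dtype=bool) for c in combined]
    choices = [c.to_numpy(dtype=object) for c in combined]
    # np.select behaves like if/elif/else: the first true condition wins,
    # and rows with no match get the default (an empty string here).
    values = np.select(conditions, choices, default="")
    return pd.Series(values, index=frame.index)


df["phone"] = first_valid_phone(df)
print(df["phone"].tolist())
```

To cover a third phone, pass prefixes=("Phone1", "Phone2", "Phone3"), provided those columns exist. The order of the prefixes sets the priority, just like the order of the elif branches.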