
Split pandas dataframe column of type string into multiple columns based on number of ',' characters


Let's say I have a pandas dataframe that looks like this:

import pandas as pd
data = {'name': ['Tom, Jeffrey, Henry', 'Nick, James', 'Chris', 'David, Oscar']}
df = pd.DataFrame(data)
df
    name
0   Tom, Jeffrey, Henry
1   Nick, James
2   Chris
3   David, Oscar

I know I can split the names into separate columns using the comma as separator, like so:

df[["name1", "name2", "name3"]] = df["name"].str.split(", ", expand=True)
df
    name                name1   name2   name3
0   Tom, Jeffrey, Henry Tom     Jeffrey Henry
1   Nick, James         Nick    James   None
2   Chris               Chris   None    None
3   David, Oscar        David   Oscar   None

However, if the name column has a row that contains 4 names, as below, the code above raises ValueError: Columns must be same length as key

data = {'name': ['Tom, Jeffrey, Henry', 'Nick, James', 'Chris', 'David, Oscar', 'Jim, Jones, William, Oliver']}
  
# Create DataFrame
df = pd.DataFrame(data)
df
    name
0   Tom, Jeffrey, Henry
1   Nick, James
2   Chris
3   David, Oscar
4   Jim, Jones, William, Oliver
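A minimal sketch of where the error comes from, assuming the 5-row data above: with expand=True the split result has one column per name in the longest row (four here), while the assignment lists only three target columns. The names parts, n_cols and error_message are illustrative only.

```python
import pandas as pd

data = {'name': ['Tom, Jeffrey, Henry', 'Nick, James', 'Chris', 'David, Oscar',
                 'Jim, Jones, William, Oliver']}
df = pd.DataFrame(data)

# expand=True returns a DataFrame whose width follows the longest row
parts = df["name"].str.split(", ", expand=True)
n_cols = parts.shape[1]  # number of columns produced by the split

# Assigning four columns to three keys fails
try:
    df[["name1", "name2", "name3"]] = parts
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(n_cols)
print(error_message)
```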

How can I automatically split the name column into n separate columns based on the ',' separator? The desired output would be this:

        name                          name1  name2    name3   name4
0       Tom, Jeffrey, Henry           Tom    Jeffrey  Henry   None
1       Nick, James                   Nick   James    None    None
2       Chris                         Chris  None     None    None
3       David, Oscar                  David  Oscar    None    None
4       Jim, Jones, William, Oliver   Jim    Jones    William Oliver

Solution

  • Use DataFrame.join to attach the DataFrame returned by str.split, with rename to generate the new column names:

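    # Map the integer column labels 0, 1, 2, ... to name1, name2, name3, ...
    f = lambda x: f'name{x+1}'
    # Split on ", " into as many columns as needed, rename them, and join them to df
    df = df.join(df["name"].str.split(", ", expand=True).rename(columns=f))
    print(df)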
                              name  name1    name2    name3   name4
    0          Tom, Jeffrey, Henry    Tom  Jeffrey    Henry    None
    1                  Nick, James   Nick    James     None    None
    2                        Chris  Chris     None     None    None
    3                 David, Oscar  David    Oscar     None    None
    4  Jim, Jones, William, Oliver    Jim    Jones  William  Oliver
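The same join-and-rename idea could be wrapped in a small reusable helper. This is only a sketch: split_into_columns and its sep and prefix parameters are hypothetical names introduced here, not part of pandas, and it assumes the column holds strings separated by a fixed delimiter.

```python
import pandas as pd

def split_into_columns(df, column, sep=", ", prefix=None):
    """Return a copy of df with `column` split into prefix1..prefixN columns.

    Hypothetical helper; N is the largest number of parts found in any row.
    """
    prefix = column if prefix is None else prefix
    parts = df[column].str.split(sep, expand=True)
    # Integer labels 0..N-1 become prefix1..prefixN
    parts.columns = [f"{prefix}{i + 1}" for i in range(parts.shape[1])]
    return df.join(parts)

data = {'name': ['Tom, Jeffrey, Henry', 'Nick, James', 'Chris', 'David, Oscar',
                 'Jim, Jones, William, Oliver']}
df = pd.DataFrame(data)
result = split_into_columns(df, "name")
print(result)
```

Because the column names are derived from the width of the split result, the helper does not need to know in advance how many names the longest row contains.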