python pandas tuples multiple-columns series

Split a Pandas Series of tuples into multiple columns on the fly


I want to split a Pandas Series of tuples into multiple columns on the fly. The code below generates dummy data:

import numpy as np
import pandas as pd

df = pd.DataFrame(data={'a':[1, 2, 3, 4, 5, 6]})
df['b'] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[0]
df['c'] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[1]

How can I create columns b and c in one line of code?

I tried the code below, but it doesn't split the tuples as required:

df[['b', 'c']] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[0:2]

Solution
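
The attempt in the question does not work because slicing through .str slices each tuple rather than splitting it, so the result is still one Series of tuples. A minimal sketch of the difference, using the question's data (the names sliced, first and second are only illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)])

# Slicing through .str slices each tuple; it does not split it into columns.
sliced = s.str[0:2]
print(type(sliced).__name__)  # still a single Series
print(sliced.iloc[1])         # each element is still a tuple

# Integer indexing, by contrast, pulls one position out of every tuple.
first = s.str[0]
second = s.str[1]
print(second.tolist())
```

Because sliced is a single column of tuples, it cannot be spread over the two target columns b and c directly.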

  • Assign each .str element to its own column:

    df = pd.DataFrame(data={'a':[1, 2, 3, 4, 5, 6]})
    
    s = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)])
    
    df['b'], df['c'] = s.str[0], s.str[1]
    

    Or build a two-column DataFrame from the tuples:

    s = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)])
    
    df[['b', 'c']] = pd.DataFrame(s.tolist(), index=df.index)
    print(df)
       a    b   c
    0  1  NaN   1
    1  2   AB  10
    2  3   CD   1
    3  4    3   1
    4  5    4   1
    5  6   NA   1
    

    The same two approaches written as one-liners:

    df['b'], df['c'] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[0], pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[1]
    df[['b', 'c']] = pd.DataFrame(pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).tolist(), index=df.index)
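
As a possible alternative that is not part of the answer above, the tuples can be transposed with Python's built-in zip and unpacked into both columns in a single statement. This is a sketch that assumes the data is available as a plain list (here called tuples, a name chosen only for illustration) in the same row order as df:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3, 4, 5, 6]})
tuples = [(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]

# zip(*tuples) transposes the list of pairs into two sequences,
# which are unpacked into the two new columns in one statement.
df['b'], df['c'] = zip(*tuples)
print(df)
```

Note that this assigns values by position, whereas the Series-based versions align on the index, so the zip form is only appropriate when the list order matches the rows of df.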