Tags: python, pandas, split

Split a column into multiple columns, but keep only two of them


I have the following dataframe:

import pandas as pd
df = pd.DataFrame({'AB': ['A1_B1_C1', 'A2_B2_C2']})

I want to split the column AB on the first two occurrences of the delimiter _, but keep only the first two resulting columns. In other words, the output should be:

         AB   A   B
0  A1_B1_C1  A1  B1
1  A2_B2_C2  A2  B2

Currently, I can do it with

df[['A', 'B', 'C']] = df['AB'].str.split('_', n=2, expand=True)
df = df.drop(columns='C')

But this seems wasteful. Is there a way to avoid creating a column that I then have to drop?


Solution

  • Use iloc to select the first two columns of the output:

    df[['A', 'B']] = df['AB'].str.split('_', n=2, expand=True).iloc[:, :2]

    Output:

             AB   A   B
    0  A1_B1_C1  A1  B1
    1  A2_B2_C2  A2  B2
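
For completeness, here is the iloc approach as a self-contained sketch built on the dataframe from the question. The intermediate name parts is only an illustrative assumption and is not part of the original answer; it is there to show that the third piece of the split lives only in a temporary object and is never added to df.

```python
import pandas as pd

df = pd.DataFrame({'AB': ['A1_B1_C1', 'A2_B2_C2']})

# n=2 performs at most two splits per string, so expand=True
# returns a temporary frame with three columns labelled 0, 1 and 2.
# `parts` is a hypothetical helper name used for illustration only.
parts = df['AB'].str.split('_', n=2, expand=True)

# Select the first two columns by position and assign them to the
# new columns A and B; column 2 is never written to df.
df[['A', 'B']] = parts.iloc[:, :2]

print(df)
```

The split still produces the third piece internally, so this mainly saves the explicit drop step rather than the work of splitting.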