
Store the parts of a split string in separate columns of a pandas DataFrame


I did the following:

df[['col1','col2']] = df.col.str.split(',', expand = True)

This works well for values like "a,b". If the value is a string without a comma, like "a", the line above puts that string in col1. Is there a one-line way to put it in the second column, col2, instead?

Note: I want the reversed placement only when the string has no comma.

To clarify, here is a sample dataframe column:

col: "a,b", "a"

Expected output:

row 1-> col1: a, col2: b

row 2-> col1: none, col2: a
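
For reference, a minimal sketch that reproduces the behaviour described above, using the two sample values from the question. It only shows what the split line does today, not the desired result:

```python
import pandas as pd

# Sample column from the question: one value with a comma, one without.
df = pd.DataFrame({'col': ['a,b', 'a']})

# str.split fills the new columns from the left, so a value without
# a comma lands in col1 and col2 is left missing.
df[['col1', 'col2']] = df['col'].str.split(',', expand=True)
print(df)
```

Row 2 ends up with "a" in col1 and a missing value in col2, which is the opposite of what is wanted.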

Thanks in advance :))


Solution

  • Use str.extract instead of str.split:

    df.col.str.extract(r'(\w*),?(\w+)')
    
       0  1
    0  a  b
    1     a
    

    where df = pd.DataFrame({'col':['a,b', 'a']})

    Note that the regex may need to be adjusted to fit the actual data.

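    EDIT: With multi-character values, the first pattern would split a comma-less string such as Faridpur into Faridpu and r. A stricter pattern, anchored at the start and with a first group that stops at the comma, avoids this:

    df.col.str.extract(r'^([^,]*),?(\b\w+)')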
    
            0         1
    0  Uttara     Dhaka
    1          Faridpur
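
To get the result back into the dataframe as col1 and col2, the extract call can be assigned the same way as the split in the question. The following is a minimal sketch, assuming the data looks like the two rows shown in the EDIT output. The variable name parts and the step that replaces empty strings with NaN are additions for illustration: extract returns an empty string rather than a missing value for the empty group, while the question asks for "none".

```python
import numpy as np
import pandas as pd

# Assumed sample data, taken from the output shown in the EDIT.
df = pd.DataFrame({'col': ['Uttara,Dhaka', 'Faridpur']})

# Pattern from the EDIT: group 1 is everything before an optional comma,
# group 2 is the word that follows. Without a comma, group 1 backtracks
# to the empty string and group 2 takes the whole word.
parts = df['col'].str.extract(r'^([^,]*),?(\b\w+)')

# Turn the empty first group into a missing value (illustrative step).
parts = parts.replace('', np.nan)

# Assign the two extracted groups to the new columns by position.
df[['col1', 'col2']] = parts
print(df)
```

If an empty string in col1 is acceptable, the replace step can be dropped and the whole thing fits on one line, as asked.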