Tags: python, pandas, dataframe, series

How to split column values into multiple columns in pandas


I have a DataFrame:

Name Zone                                                    Dummy
A    BARI (BA), BARLETTA (BT), BRINDISI (BR), FOGGIA (FG)     2
B    BARI (BA), FOGGIA (FG)                                   2
C    HDEF (SE), LECCE (LE)                                    3
D    GUVA (PP)                                                3

I need the DataFrame to look like this:

    Name Zone                                             Symbol            Dummy
    A    BARI , BARLETTA , BRINDISI , FOGGIA         (BA),(BT),(BR),(FG)      2
    B    BARI , FOGGIA                               (FG),(BA)                2
    C    HDEF , LECCE                                (LE),(SE)                3
    D    GUVA                                        (PP)                     3

I tried to split Zone into Zone and Symbol using

Series.str.split()

but it did not work as expected.


Solution

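  • You could use str.replace here: remove the zone names to get Symbol, then remove the parenthesized codes to get Zone. Pass regex=True, because pandas 2.0 and later treat the pattern as a literal string by default:

    # Strip each zone name (uppercase letters followed by whitespace), leaving the codes.
    df["Symbol"] = df["Zone"].str.replace(r'(?:^|\s+)[A-Z]+\s+', '', regex=True)
    # Strip each parenthesized code, leaving the zone names.
    df["Zone"] = df["Zone"].str.replace(r'\s*\(.*?\)\s*', '', regex=True)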
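To try the approach end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the two replacements. It assumes Zone is stored as plain strings. The variable names `symbol_alt` and `result` are only illustrative. The `str.findall` line is a swapped-in alternative, not part of the answer above: it collects the parenthesized codes directly, so it should not depend on zone names being uppercase letters only.

```python
import pandas as pd

# Sample data copied from the question (assumed to be plain strings).
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Zone": [
        "BARI (BA), BARLETTA (BT), BRINDISI (BR), FOGGIA (FG)",
        "BARI (BA), FOGGIA (FG)",
        "HDEF (SE), LECCE (LE)",
        "GUVA (PP)",
    ],
    "Dummy": [2, 2, 3, 3],
})

# Symbol must be derived before Zone is overwritten.
df["Symbol"] = df["Zone"].str.replace(r"(?:^|\s+)[A-Z]+\s+", "", regex=True)
df["Zone"] = df["Zone"].str.replace(r"\s*\(.*?\)\s*", "", regex=True)

# Alternative for Symbol: find every "(...)" group and join the matches.
# Run here on the Symbol column only as a cross-check of the result.
symbol_alt = df["Symbol"].str.findall(r"\([^)]*\)").str.join(",")

# Reorder the columns to match the layout requested in the question.
result = df[["Name", "Zone", "Symbol", "Dummy"]]
print(result.to_string(index=False))
```

Both variants keep the codes in the same order as the zone names in the original string, so row B yields (BA),(FG) rather than the (FG),(BA) shown in the desired output.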