python pandas dataframe explode

Explode a column of [(str, int)] arrays in a pandas DataFrame


I have a dataframe:

    import pandas as pd

    df = pd.DataFrame({
        'day': ['11', '12'],
        'City': ['[(Mumbai, 1),(Bangalore, 2)]', '[(Pune, 3),(Mumbai, 4),(Delh, 5)]']
    })

   day                               City
0  11       [(Mumbai, 1),(Bangalore, 2)]
1  12  [(Pune, 3),(Mumbai, 4),(Delh, 5)]

I want to explode the City column, but when I do, nothing changes.

    df2 = df.explode('City')

The output I want:

  day            City
0  11     (Mumbai, 1)
1  11  (Bangalore, 2)
2  12       (Pune, 3)
3  12     (Mumbai, 4)
4  12       (Delh, 5)

Solution

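  • You can't explode strings, and each City value here is a string that only looks like a list, so explode leaves it unchanged. You need to convert the strings to lists first.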

    Assuming your city names contain only letters (or spaces), you could use a regex to add the missing quotes and then convert each string to a list with ast.literal_eval:

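    from ast import literal_eval

    # Quote each city name so the string becomes a valid Python literal,
    # then parse it into a list of (name, number) tuples.
    df['City'] = (df['City']
                  .str.replace(r'([a-zA-Z ]+),', r'"\1",', regex=True)
                  .apply(literal_eval)
                  )

    # One row per tuple; ignore_index=True renumbers the rows from 0.
    df2 = df.explode('City', ignore_index=True)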
    

    Output:

      day            City
    0  11     (Mumbai, 1)
    1  11  (Bangalore, 2)
    2  12       (Pune, 3)
    3  12     (Mumbai, 4)
    4  12       (Delh, 5)
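
If the city names might contain characters other than letters and spaces (digits, hyphens, apostrophes), the quoting regex above may not match them. A possible alternative, sketched here under the assumption that every entry has the form `(name, integer)` and that names never contain commas or parentheses, is to pull the pairs out directly with `re.findall` and skip `literal_eval`. The names `PAIR` and `parse_pairs` are hypothetical helpers introduced for this example, not part of pandas.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'day': ['11', '12'],
    'City': ['[(Mumbai, 1),(Bangalore, 2)]', '[(Pune, 3),(Mumbai, 4),(Delh, 5)]'],
})

# Hypothetical pattern: one "(name, integer)" pair per match.
# Assumes names contain no commas or parentheses.
PAIR = re.compile(r'\(\s*([^,()]+?)\s*,\s*(-?\d+)\s*\)')


def parse_pairs(text):
    """Turn '[(Mumbai, 1),(Pune, 3)]' into [('Mumbai', 1), ('Pune', 3)]."""
    return [(name, int(num)) for name, num in PAIR.findall(text)]


# Replace each string with a real list of tuples, then explode as before.
df['City'] = df['City'].apply(parse_pairs)
df2 = df.explode('City', ignore_index=True)
```

Unlike the `literal_eval` approach, this never evaluates the string as Python syntax, so a malformed entry is silently skipped rather than raising an error. Whether that is preferable depends on how much you trust the input.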