I have a pandas dataframe with one column like this:
Merged_Cities |
---|
New York, Wisconsin, Atlanta |
Tokyo, Kyoto, Suzuki |
Paris, Bordeaux, Lyon |
Mumbai, Delhi, Bangalore |
London, Manchester, Bermingham |
And I want a new dataframe with the output like this:
Merged_Cities | Cities |
---|---|
New York, Wisconsin, Atlanta | New York |
New York, Wisconsin, Atlanta | Wisconsin |
New York, Wisconsin, Atlanta | Atlanta |
Tokyo, Kyoto, Suzuki | Tokyo |
Tokyo, Kyoto, Suzuki | Kyoto |
Tokyo, Kyoto, Suzuki | Suzuki |
Paris, Bordeaux, Lyon | Paris |
Paris, Bordeaux, Lyon | Bordeaux |
Paris, Bordeaux, Lyon | Lyon |
Mumbai, Delhi, Bangalore | Mumbai |
Mumbai, Delhi, Bangalore | Delhi |
Mumbai, Delhi, Bangalore | Bangalore |
London, Manchester, Bermingham | London |
London, Manchester, Bermingham | Manchester |
London, Manchester, Bermingham | Bermingham |
In short I want to split all the cities into different rows while maintaining the 'Merged_Cities' column.
Here's a replicable version of df
:
df = pd.DataFrame({'Merged_Cities':['New York, Wisconsin, Atlanta',
'Tokyo, Kyoto, Suzuki',
'Paris, Bordeaux, Lyon',
'Mumbai, Delhi, Bangalore',
'London, Manchester, Bermingham']})
Use .str.split()
and .explode()
:
df = df.assign(Cities=df["Merged_Cities"].str.split(", ")).explode("Cities")
print(df)
Prints:
Merged_Cities Cities
0 New York, Wisconsin, Atlanta New York
0 New York, Wisconsin, Atlanta Wisconsin
0 New York, Wisconsin, Atlanta Atlanta
1 Tokyo, Kyoto, Suzuki Tokyo
1 Tokyo, Kyoto, Suzuki Kyoto
1 Tokyo, Kyoto, Suzuki Suzuki
2 Paris, Bordeaux, Lyon Paris
2 Paris, Bordeaux, Lyon Bordeaux
2 Paris, Bordeaux, Lyon Lyon
3 Mumbai, Delhi, Bangalore Mumbai
3 Mumbai, Delhi, Bangalore Delhi
3 Mumbai, Delhi, Bangalore Bangalore
4 London, Manchester, Bermingham London
4 London, Manchester, Bermingham Manchester
4 London, Manchester, Bermingham Bermingham