I have a dataframe with a 'timezone' column. Some of the entries are listed as 'country/city'. I want them to just be 'city'. There were similar questions on Stack Overflow, from which I came up with the following:
df['timezone'] = df['timezone'].str.split('/').str[1]
However, this wiped out the entries that have no '/' in them (they became NaN). I tried various other adaptations but couldn't get any to work.
Next I tried to construct a lambda function and use map, trying various adaptations of the line below, but this didn't work either.
df['timezone'] = df['timezone'].map(lambda x: x.split('/').str[1])
#AttributeError: 'list' object has no attribute 'str'
Finally, I decided to write a loop (below). Python took a while working through it and I was hopeful, but in the end nothing seemed to happen.
x = df['timezone']
for entry in x.items():
    if x.str.contains('/') is True:
        x.str.split('/').str[1]
        update(x)
    else:
        pass
Any help or advice much appreciated, thanks.
Restrict the number of splits to 1 (required when the delimiter could occur more than once), and then use str[-1] instead of str[1]:
df
       timezone
0  country/city
1           foo
2           bar
df['timezone'] = df['timezone'].str.split('/', n=1).str[-1]
df
   timezone
0      city
1       foo
2       bar
str[-1] adequately handles those cases where there was nothing to split on: the split then returns a single-element list, and its last element is the original string.
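As a rough, self-contained sketch of the same idea, the snippet below builds a small made-up frame (the sample values, including the three-part entry, are assumptions for illustration and not from the question) and compares three variants: the left split from this answer, a right split that keeps only the final component, and a `map` version that indexes the list directly instead of calling `.str` on it.

```python
import pandas as pd

# Hypothetical sample data: one entry with a single '/', one with none,
# and one with two '/' characters.
df = pd.DataFrame(
    {"timezone": ["country/city", "foo", "America/Argentina/Buenos_Aires"]}
)

# Split at most once from the left and keep the last piece.
# Entries without '/' yield a one-element list, so [-1] returns them unchanged.
left_split = df["timezone"].str.split("/", n=1).str[-1]

# Variant: split at most once from the right to keep only the final component.
right_split = df["timezone"].str.rsplit("/", n=1).str[-1]

# Plain-Python equivalent with map: str.split returns a list,
# so index it with [-1] rather than .str[-1].
mapped = df["timezone"].map(lambda s: s.split("/", 1)[-1])

print(left_split.tolist())
print(right_split.tolist())
```

Which variant fits depends on what should happen to entries with more than one '/': the left split keeps everything after the first delimiter, while the right split keeps only the last segment. The `map` version assumes every entry is a string; a missing value would raise an error there, whereas the `.str` accessor passes it through.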