I have a dataframe where some, but not all, columns start with weird prefixes, for example ot1/pr1/cp1/dkjsakhka, gt2/pr1/fl1/ct4/rmt12/dasljdals, etc. I want to rename all columns which have slashes in their names by removing all the characters before the last slash, e.g. ot1/pr1/cp1/dkjsakhka -> dkjsakhka and gt2/pr1/fl1/ct4/rmt12/dasljdals -> dasljdals, while leaving the other column names untouched. How can I do that?
You can use str.rsplit to split each name once from the right and keep the last piece:
df.columns = df.columns.str.rsplit('/', n=1).str[-1]
Output:
# before
>>> df.columns
Index(['untouched column', 'ot1/pr1/cp1/dkjsakhka',
'gt2/pr1/fl1/ct4/rmt12/dasljdals'],
dtype='object')
# after
>>> df.columns
Index(['untouched column', 'dkjsakhka', 'dasljdals'], dtype='object')
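For anyone who wants to reproduce this end to end, here is a minimal sketch. The one-row sample frame is made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical one-row frame; only the column names are taken from the question.
df = pd.DataFrame(
    [[1, 2, 3]],
    columns=[
        "untouched column",
        "ot1/pr1/cp1/dkjsakhka",
        "gt2/pr1/fl1/ct4/rmt12/dasljdals",
    ],
)

# Split each name at most once, starting from the right, then keep the last piece.
# A name without a slash splits into a single-element list, so it stays as it was.
df.columns = df.columns.str.rsplit("/", n=1).str[-1]

new_columns = list(df.columns)
print(new_columns)
```

Because n=1 limits the split to the last slash, the earlier slashes in the prefix are never examined individually; everything before the last one is simply discarded.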
EDIT:
You can also use rename with a function if you prefer:
df = df.rename(lambda x: x.rsplit('/', maxsplit=1)[-1], axis=1)
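A slightly more defensive sketch of the rename variant, in case some column labels are not strings (for example integers), where calling rsplit on the label would raise an AttributeError. The helper name strip_prefix and the sample frame are hypothetical, not part of the original answer.

```python
import pandas as pd

# Hypothetical frame mixing string labels with an integer label.
df = pd.DataFrame(
    [[1, 2, 3]],
    columns=["untouched column", "ot1/pr1/cp1/dkjsakhka", 0],
)


def strip_prefix(name):
    """Return the part after the last '/', leaving non-string labels alone."""
    if isinstance(name, str):
        return name.rsplit("/", maxsplit=1)[-1]
    return name


# rename returns a new DataFrame; the original df keeps its old labels.
renamed = df.rename(columns=strip_prefix)

print(list(renamed.columns))
```

Note that rename does not modify df in place here, so the result has to be assigned, as in the one-liner above.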