
pandas/python: combining replace and loc for replacing part of column names within a range


Is it possible to use loc and replace together to replace part of the column name for a range of columns? I've tried combining replace and loc in a couple of variations, but without success. Alternatively, is there another way to change part of the column names across a range of columns?

df.columns = df.columns.str.replace('j','rep',regex=True)
df.loc[:, 10:]

many thanks, regards


Solution

  • Consider a dataframe with the following columns:

    >>> df.columns
    Index(['foo', 'bar', 'baz', 'twobaz', 'threebaz'], dtype='object', name='col')
    

    Now suppose you wish to replace the string baz with the string BAZ in only the last two columns. One possible approach is to select the last two columns, replace the string in those columns, and combine the result with the remaining columns:

    df.columns = [*df.columns[:3], *df.columns[3:].str.replace('baz', 'BAZ', regex=True)]
    
    >>> df.columns
    Index(['foo', 'bar', 'baz', 'twoBAZ', 'threeBAZ'], dtype='object')
    

    Another possible approach is to use the rename method of the dataframe. Its benefit is that it preserves the name of the columns index (if any), which the list assignment above discards:

    c = df.columns[3:]
    df = df.rename(columns=dict(zip(c, c.str.replace('baz', 'BAZ', regex=True))))
    
    >>> df.columns
    Index(['foo', 'bar', 'baz', 'twoBAZ', 'threeBAZ'], dtype='object', name='col')
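
For a case like the one in the question (every column from some position onward), the rename approach can be wrapped in a small function. This is only a sketch: `rename_column_range` is a hypothetical helper, not part of pandas, the sample frame is made up to mirror the column names above, and it assumes you want a literal (non-regex) match on a positional slice of the columns.

```python
import pandas as pd

# Hypothetical frame mirroring the column names used above.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5]],
    columns=pd.Index(["foo", "bar", "baz", "twobaz", "threebaz"], name="col"),
)


def rename_column_range(frame, start, old, new, stop=None):
    """Replace `old` with `new` in the column names at positions start:stop.

    Hypothetical helper, not a pandas function. The match is literal
    (regex=False) and the slice is positional, like iloc.
    """
    target = frame.columns[start:stop]
    # Map each selected label to its replaced version; rename leaves
    # every label that is not a key of the mapping untouched.
    mapping = dict(zip(target, target.str.replace(old, new, regex=False)))
    return frame.rename(columns=mapping)


renamed = rename_column_range(df, 3, "baz", "BAZ")
print(list(renamed.columns))  # ['foo', 'bar', 'baz', 'twoBAZ', 'threeBAZ']
print(renamed.columns.name)   # col
```

For the frame in the question, the call would look like `rename_column_range(df, 10, 'j', 'rep')`. Because rename matches by label rather than by position, a label inside the range that also appears outside it (duplicate column names) would be renamed in both places.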