python-3.x pandas startswith

Pandas trim specific leading characters


Given the following data frame:

import pandas as pd

df = pd.DataFrame({
    'A': ['a', 'b', 'c', 'd'],
    'B': ['and one', 'two', 'three', 'and four']
})

df

    A   B
0   a   and one
1   b   two
2   c   three
3   d   and four

I'd like to trim the leading 'and ' from any cell that starts with it. The desired result is as follows:

    A   B
0   a   one
1   b   two
2   c   three
3   d   four

Thanks in advance!


Solution

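  • You could use a regex with str.replace (pass regex=True, since pandas 2.0 and later treat the pattern as a literal string by default):

    >>> df
       A          B
    0  a    and one
    1  b        two
    2  c  three and
    3  d   and four
    >>> df["B"] = df["B"].str.replace("^and ", "", regex=True)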
    >>> df
       A          B
    0  a        one
    1  b        two
    2  c  three and
    3  d       four
    

    (Note that I put an "and" at the end of row 2 to show it wouldn't be changed: the ^ anchors the pattern to the start of each string.)
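
For reference, here is a self-contained sketch of the same idea that can be run as a script. It rebuilds the frame from the answer (with "three and" in row 2) and also shows a literal-prefix alternative, str.removeprefix, which I'm assuming is available (it was added in pandas 1.4). The names by_regex and by_prefix are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a", "b", "c", "d"],
    "B": ["and one", "two", "three and", "and four"],
})

# Regex approach: "^" anchors the match to the start of each string,
# so the trailing "and" in row 2 is left alone.
# regex=True is passed explicitly because pandas 2.0+ defaults to a literal match.
by_regex = df["B"].str.replace(r"^and ", "", regex=True)

# Literal-prefix approach (assumes pandas >= 1.4): no regex involved.
by_prefix = df["B"].str.removeprefix("and ")

print(by_regex.tolist())   # ['one', 'two', 'three and', 'four']
print(by_prefix.tolist())  # ['one', 'two', 'three and', 'four']
```

If the prefix is a fixed string, removeprefix is probably the simpler choice since nothing needs escaping. The regex form is more flexible, for example if the amount of whitespace after "and" varies.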