Tags: regex, python-3.x, pandas, dataframe, startswith

Python 3 pandas: select DataFrame rows using startswith with multiple (OR) conditions


I'm looking for the correct syntax to use str.startswith with more than one condition.

The working code I have only returns offices that start with the letter "N":

    new_df = df[df['Office'].str.startswith("N", na=False)]

I need code that returns offices starting with any of the letters "N", "M", "V", or "R". The following doesn't work:

    new_df = df[df['Office'].str.startswith("N|M|V|R", na=False)]

What am I missing? Thanks!


Solution

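  • str.startswith treats its argument as a literal prefix, not a regular expression, so "N|M|V|R" only matches values that literally begin with that text. Use str.contains with a pattern anchored at the start of the string instead: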

    df[df['Office'].str.contains("^(?:N|M|V|R)")]
    

    or, using a character class (the trailing + is optional here, since one matching letter is enough):

    df[df['Office'].str.contains("^[NMVR]+")]
    

    Demo:

    In [91]: df
    Out[91]:
            Office
    0        No-No
    1         AAAA
    2    MicroHard
    3       Valley
    4        vvvvv
    5   zzzzzzzzzz
    6  Risk is fun
    
    In [92]: df[df['Office'].str.contains("^(?:N|M|V|R)")]
    Out[92]:
            Office
    0        No-No
    2    MicroHard
    3       Valley
    6  Risk is fun
    
    In [93]: df[df['Office'].str.contains("^[NMVR]+")]
    Out[93]:
            Office
    0        No-No
    2    MicroHard
    3       Valley
    6  Risk is fun
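
The demo above has no missing values, while the question passes na=False. Below is a minimal sketch, assuming sample data like the demo plus one missing entry (the data and variable names are illustrative, not from the original post). It shows why the literal startswith call returns nothing, and a few ways to get the intended rows while treating missing values as non-matches:

```python
import pandas as pd

# Illustrative data mirroring the demo, plus one missing value (assumption).
df = pd.DataFrame({"Office": ["No-No", "AAAA", "MicroHard", "Valley",
                              "vvvvv", "zzzzzzzzzz", "Risk is fun", None]})

# startswith() compares against a literal prefix, so this only matches
# values that literally begin with the text "N|M|V|R" (none here).
literal = df[df["Office"].str.startswith("N|M|V|R", na=False)]

# Regex anchored at the start; na=False counts missing values as non-matches.
by_contains = df[df["Office"].str.contains(r"^[NMVR]", na=False)]

# str.match() only matches at the beginning of each string, so no ^ is needed.
by_match = df[df["Office"].str.match(r"[NMVR]", na=False)]

# Non-regex alternative: test the first character against a list of letters.
by_first_char = df[df["Office"].str[:1].isin(list("NMVR"))]

print(by_contains)
```

Without na=False, a missing value produces NaN in the boolean mask, which can raise an error when used for indexing. Recent pandas versions may also accept a tuple of prefixes in str.startswith, as Python's built-in str.startswith does; check the documentation for your installed version before relying on that.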