Tags: python, regex, pandas, data-manipulation, python-re

Python search string contains characters


I have the data below:

col1      
086945159
549615853
589ac2546
GED456231
F56hy8W12

I want to check whether each value in col1 contains a non-numeric character. If it does, col2 should be NaN; otherwise col2 should hold the original value:

col1         col2 
086945159    086945159
549615853    549615853
589ac2546    NaN
GED456231    NaN
F56hy8W12    NaN
111111111    NaN
222222222    NaN

I used re.search(r'[^0-9]+', str) to find non-digit characters. However, how can I use this in apply(), given that a value made up of a single repeated digit, like 111111111 or 222222222, should also return NaN?


Solution

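  • You can use mask, which replaces values with NaN wherever the condition is True, together with a regex that has two alternatives:

    # \D matches any non-digit character
    # ^(.)\1*$ matches a string made of a single repeated character
    df['col2'] = df.col1.mask(df.col1.str.contains(r'\D|^(.)\1*$'))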
    

    Output:

            col1       col2
    0  086945159  086945159
    1  549615853  549615853
    2  589ac2546        NaN
    3  GED456231        NaN
    4  F56hy8W12        NaN
    5  111111111        NaN
    6  222222222        NaN
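
Since the question asks about apply(), here is a minimal sketch of the same check written with re.search and apply(). It assumes col1 holds strings (the leading zero in 086945159 suggests it does). The names PATTERN and keep_if_valid are made up for illustration and are not part of the original answer.

```python
import re

import pandas as pd

# Sample data from the question, kept as strings so leading zeros survive.
df = pd.DataFrame({
    "col1": [
        "086945159", "549615853", "589ac2546", "GED456231",
        "F56hy8W12", "111111111", "222222222",
    ]
})

# \D        -> any non-digit character anywhere in the string
# ^(.)\1*$  -> the whole string is one character repeated
PATTERN = re.compile(r"\D|^(.)\1*$")


def keep_if_valid(value):
    """Return the value unchanged, or NaN if it matches PATTERN (hypothetical helper)."""
    return float("nan") if PATTERN.search(value) else value


df["col2"] = df["col1"].apply(keep_if_valid)
print(df)
```

This should give the same col2 as the mask version above. The mask approach is vectorized and shorter, though pandas may emit a UserWarning from str.contains because the pattern contains a capture group; the apply() version avoids that at the cost of a Python-level call per row.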