python regex pandas contains

Why can't str.contains select rows that contain a specified string?


>>> y
1    2002-12-31
2    2003-12-31
3    2004-03-31
4    2004-06-30
Name: report_date, dtype: object

I want to extract the rows that contain 12-31.

>>> y.str.contains('12-31')
>>> y.str.contains('\.+12-31')
>>> y.str.contains('2002-12-31')

All three expressions give the same output:

1   NaN
2   NaN
3   NaN
4   NaN
Name: report_date, dtype: float64

How can I extract the rows that contain the string 12-31? My desired output:

1   True
2   True
3   NaN
4   NaN

Solution

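  • The column may hold date objects rather than strings, so .str.contains finds no strings to match (hence the NaN output in the question). Convert the values to strings first: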

    m = y.astype(str).str.contains('12-31')
    print(m)
    0     True
    1     True
    2    False
    3    False
    Name: report_date, dtype: bool
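
To check whether this diagnosis applies, it can help to inspect the element types before converting. The sketch below is only an illustration: it assumes the column holds datetime.date objects (the question does not show how y was built), and the names element_types, mask, selected and by_parts are made up for the example. Depending on the pandas version, calling .str.contains directly on such a column may return NaN, as in the question, or raise an error, so the sketch skips that call.

```python
import datetime

import pandas as pd

# Assumed reconstruction of the question's data: datetime.date objects
# stored in an object-dtype Series (the source does not confirm this).
y = pd.Series(
    [
        datetime.date(2002, 12, 31),
        datetime.date(2003, 12, 31),
        datetime.date(2004, 3, 31),
        datetime.date(2004, 6, 30),
    ],
    index=[1, 2, 3, 4],
    name="report_date",
)

# dtype "object" does not say what the elements are, so look at them directly.
element_types = y.map(type).unique()
print(element_types)

# Convert to strings first, then match; the mask is a plain boolean Series.
mask = y.astype(str).str.contains("12-31")

# Boolean indexing keeps only the rows where the mask is True.
selected = y[mask]
print(selected)

# Alternative without string matching: parse to datetimes and compare parts.
dates = pd.to_datetime(y)
by_parts = (dates.dt.month == 12) & (dates.dt.day == 31)
```

Comparing the month and day through the .dt accessor avoids accidental substring matches, though the string conversion is usually enough when the values print as ISO dates like those in the question.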