python, pandas, dataframe, substring, contains

How to find an exact substring in pandas?


I'm trying to find a substring (taken from one data frame) in a main string (from the main data frame), but I don't get the desired result. The file details and output are below.

First data frame

handleid
49483
51466
83821
94159
105068

I want to search for 49483 in the id column of the main data frame. The result is as follows.

id                collection_id     dc_language_iso
dli_ndli/49483    NaN               English
dli_ndli/494830   NaN               Kannada
dli_ndli/494831   NaN               Kannada
dli_ndli/494832   NaN               Kannada 

The results above show that I am getting 49483, 494830, 494831, and 494832, but I only want the first row, i.e. dli_ndli/49483 NaN English. I don't want the rows where 49483 is only a substring of a longer number (494830, 494831, 494832).

I am using the contains function available in pandas (str.contains).


Solution

  • Anchor the pattern to the end of the string with $, so that 49483 followed by more digits no longer matches:

    newdf[newdf['id'].str.contains('49483$', regex=True)]
    
    #Out[216]: 
    #               id  collection_id dc_language_iso
    #0  dli_ndli/49483            NaN         English
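
Note that '49483$' only anchors the end of the string, so an id such as dli_ndli/149483 would still match. If the goal is to look up every handleid from the first data frame, one possible approach (not part of the original answer) is to split off the part after the last "/" and compare it for equality with isin. This is a minimal sketch: the frame name handles and the sample rows are assumptions rebuilt from the data shown in the question.

```python
import pandas as pd

# Hypothetical stand-ins for the two data frames in the question.
handles = pd.DataFrame({"handleid": [49483, 51466, 83821, 94159, 105068]})
newdf = pd.DataFrame({
    "id": ["dli_ndli/49483", "dli_ndli/494830",
           "dli_ndli/494831", "dli_ndli/494832"],
    "collection_id": [float("nan")] * 4,
    "dc_language_iso": ["English", "Kannada", "Kannada", "Kannada"],
})

# Take everything after the last "/" as the numeric part of the id.
suffix = newdf["id"].str.rsplit("/", n=1).str[-1]

# Compare whole strings, so 49483 matches "49483" but not "494830".
matches = newdf[suffix.isin(handles["handleid"].astype(str))]
print(matches)
```

Because isin tests for equality rather than containment, this avoids the partial-match problem for all handle ids at once and needs no regex.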