r, regex, word-boundary

How to use grep()/gsub() to find exact match


string = c("apple", "apples", "applez")
grep("apple", string)

This gives me the indices of all three elements in string. But I want an exact match on the word "apple" (i.e., I just want grep() to return index 1).


Solution

  • Use the word boundary \b, which matches the position between a word character and a non-word character (or the start or end of the string):

    string = c("apple", "apples", "applez")
    grep("\\bapple\\b", string)
    [1] 1
    

    OR

    Use anchors. ^ asserts that we are at the start of the string, and $ asserts that we are at the end.

    grep("^apple$", string)
    [1] 1
    

    You can also store the regex in a variable and use it as shown below.

    pat <- "\\bapple\\b"
    grep(pat, string)
    [1] 1
    pat <- "^apple$"
    grep(pat, string)
    [1] 1
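
The two patterns agree on the original vector, but they are not equivalent in general. The sketch below is only an illustration: the vector x and its extra elements ("apple:s", "an apple pie") are assumptions added here, not part of the question. It suggests that \b accepts "apple" as a whole word anywhere in an element, while ^ and $ require the entire element to be "apple".

```r
# Hypothetical test vector; "apple:s" and "an apple pie" are added for illustration.
x <- c("apple", "apples", "apple:s", "an apple pie")

# \\b matches wherever "apple" stands as a whole word, anywhere in the element.
word_match <- grepl("\\bapple\\b", x)

# ^ and $ require the whole element to be exactly "apple".
exact_match <- grepl("^apple$", x)

# Side-by-side comparison of the two patterns.
print(data.frame(x, word_match, exact_match))
```

If "exact match" means the whole element must equal "apple", the anchored pattern is probably the one you want.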
    

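    Update: if the pattern is stored without anchors, add them with paste(). Anchors are the safer choice for data like the vector shown here, because \bapple\b would also match "apple:s".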

    paste("^",pat,"$", sep="")
    [1] "^apple$"
    string
    [1] "apple"   "apple:s" "applez" 
    pat
    [1] "apple"
    grep(paste("^",pat,"$", sep=""), string)
    [1] 1
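
The question also mentions gsub(), so here is a minimal sketch of the same idea used for replacement. The replacement text "orange" and the variable names are assumptions chosen for illustration. It also assumes pat contains no regex metacharacters; if it did, they would be interpreted as regex syntax rather than literally.

```r
x <- c("apple", "apples", "apple:s")
pat <- "apple"   # bare word, no anchors yet

# Anchored pattern: only elements that are exactly "apple" are replaced.
anchored <- paste0("^", pat, "$")
replaced_exact <- gsub(anchored, "orange", x)

# Word-boundary pattern: "apple" is replaced wherever it stands as a whole word.
bounded <- paste0("\\b", pat, "\\b")
replaced_word <- gsub(bounded, "orange", x)

print(replaced_exact)
print(replaced_word)
```

With this input, the anchored version changes only the first element, while the word-boundary version also rewrites "apple:s" to "orange:s".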