Get indices of all matching characters in a string in R


I want to get the indices of all occurrences of certain characters in a word. Assume the characters I am looking for are: l, e, a, z.

I tried the following regex in the grep function, and dozens of variations of it, but I never get the result I want.

grep("/([leazoscnz]{1})/", "ylaf", value = F)

gives me

numeric(0)

where I would like:

[1] 2 3 

Solution

  • To make grep work on the individual characters of a string, you first need to split the string into a vector of single characters. You can use strsplit for this:

    strsplit("ylaf", split="")[[1]]
    [1] "y" "l" "a" "f"
    

    Next, simplify your regular expression (R patterns are not wrapped in / delimiters, so the slashes in your pattern were matched literally) and try grep again:

    grep("[leazoscnz]", strsplit("ylaf", split="")[[1]])
    
    [1] 2 3
    

    But it is easier to use gregexpr, which returns the positions of all matches directly:

    gregexpr("[leazoscnz]", "ylaf")
    [[1]]
    [1] 2 3
    attr(,"match.length")
    [1] 1 1
    attr(,"useBytes")
    [1] TRUE
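
The gregexpr result above is a list with attributes attached, and gregexpr reports -1 when nothing matches. If a plain integer vector is wanted, a small wrapper is one way to get it. This is only a sketch: the name match_positions is hypothetical and not part of the original answer, and it assumes a single input string.

```r
# Hypothetical helper (not from the original answer): return the 1-based
# positions of every match of `pattern` in the single string `x`.
match_positions <- function(pattern, x) {
  pos <- gregexpr(pattern, x)[[1]]
  # gregexpr signals "no match" with -1, so map that case to an empty vector.
  if (pos[1] == -1L) {
    integer(0)
  } else {
    as.integer(pos)  # as.integer drops the match.length and other attributes
  }
}

match_positions("[leaz]", "ylaf")  # 2 3
match_positions("[leaz]", "qrs")   # integer(0)

# The same positions without a regex, using the strsplit approach:
which(strsplit("ylaf", split = "")[[1]] %in% c("l", "e", "a", "z"))  # 2 3
```

Returning integer(0) for the no-match case keeps the result usable with length() and indexing, whereas the raw -1 from gregexpr could be mistaken for a position.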