regex, ruby, position

How to get indexes of all occurrences of a pattern in a string


string = "Jack and Jill went up the hill to fetch a pail of water. Jack fell down and broke his crown. And Jill came tumbling after. "
d = string.match(/(jack|jill)/i) # -> #<MatchData "Jack" 1:"Jack">
d.size # -> 2 (the whole match plus one capture group)

This only matches the first occurrence, it seems.
string.scan partially does the job: it returns every match, but it says nothing about the index of each match.

How do I get a list of all the matched instances of the pattern and their indices (positions)?


Solution

  • You can use .scan together with the $` global variable, which holds the string to the left of the last successful match. A plain .scan call does not expose it for each match, so you need this hack (stolen from this answer):

    string = "Jack and Jill went up the hill to fetch a pail of water. Jack fell down and broke his crown. And Jill came tumbling after. "  
    string.to_enum(:scan, /(jack|jill)/i).map do |m,|
      p [$`.size, m]
    end
    

    output:

    [0, "Jack"]
    [9, "Jill"]
    [57, "Jack"]
    [97, "Jill"]
    

    Update:

    Note the behaviour with lookbehind: you get the index of the part that actually matched, not of the lookbehind text:

    irb> "ab".to_enum(:scan, /ab/     ).map{ |m,| [$`.size, $~.begin(0), m] }
    => [[0, 0, "ab"]]
    irb> "ab".to_enum(:scan, /(?<=a)b/).map{ |m,| [$`.size, $~.begin(0), m] }
    => [[1, 1, "b"]]
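
If you need this in more than one place, the same idea can be wrapped in a small method. This is only a sketch: the name match_positions is made up for illustration, and it uses Regexp.last_match (the same object as $~) instead of $`, so the offset comes straight from MatchData#begin. The offsets are character offsets, not byte offsets.

```ruby
# Hypothetical helper (not part of Ruby's core library):
# returns [offset, matched_text] pairs for every match of pattern in string.
def match_positions(string, pattern)
  string.to_enum(:scan, pattern).map do
    m = Regexp.last_match   # MatchData for the match currently being yielded
    [m.begin(0), m[0]]      # character offset and text of the whole match
  end
end

string = "Jack and Jill went up the hill to fetch a pail of water. Jack fell down and broke his crown. And Jill came tumbling after. "
positions = match_positions(string, /jack|jill/i)
positions.each { |offset, text| puts "#{offset}: #{text}" }
```

Because the block ignores what scan yields and reads the MatchData instead, it does not matter whether the pattern contains capture groups.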