Search code examples
vbscriptasp-classic

add unique anchors to regexp matches in text


I use vbs regexp to highlight search matches in a text, and try to figure out how to combine the highlight spam with unique id anchor to for hyperlinking from a list of the matches.

I have tried a for IncrementCount = 1 to match.count loop, but that return the total on all matches.

Function HighlightText(strInput,arrtext)
    Set re = New RegExp 
    re.Pattern="(?!<.*?)(" & arrtext & ")(?![^<>]*?>)" 
    re.IgnoreCase = True 
    re.Global = True
         if arrtext <> "" then 
         strOutput = re.Replace(strInput,"<span id="""&IncrementCount&""" style=""background-color:#FFD200;"">"&""&"$&</span>")
         Else
         strOutput = strInput
         end if
    HighlightText = strOutput
    set re = nothing
end function

Solution

  • As far as I know you cannot get hold of a match number when using RegExp.Replace in vbScript. So, you have to use the RegExp.Execute method to get a collection of matches, then process each match yourself.

    Here is some example code that I think does what you want. It uses regexp to give the match collection that identifies the places in the input string that need to be replaced, then uses standard vbScript functions Left() and Mid() to chop the string at right places and insert the required spans. Note that it processes the matches in reverse order, so that the chop/insert can use the indexes of the original match without worrying about what has already been replaced in the output string.

    Also, note you might need to pre-process the match string if this is user-input, because the user may key in characters that are significant in a regular expression.

    This is certainly not as efficient as using RegExp.Replace but it does provide the relevant unique IDs for each match.

    '// Function finds all occurrences of a string 
    '// (or multiple strings separated by |) within 
    '// an input string, and places a <span> tag
    '// with a unique numbered id around each occurrence
    Function HighlightText(strInput,sFind)
    
        '// Don't do anything of no string to be found
        If len(sFind) = 0 then
            HighlightText = strInput
            Exit Function
        end If
    
        '// Define regexp
        Dim re
        Set re = New RegExp 
    
        '// Pattern to be found
        re.Pattern="(?!<.*?)(" & sFind & ")(?![^<>]*?>)" 
        re.IgnoreCase = True 
        re.Global = True
    
        '// find all the matches >> match collection
        Dim oMatches: Set oMatches = re.Execute( strInput )
    
        '// Prepare to process
        Dim sMatchedText
        Dim strOutput
        Dim oMatch
        Dim ix
    
        '// Initialize the output
        strOutput = strInput
    
        '// Process each match starting at the last one
        '// - this avoids needing to recalculate indexes
        '// after each replacement
        For ix = oMatches.Count -1 to 0 Step -1
    
            '// get the match
            Set oMatch = oMatches(ix)
    
            '// get the text that was matched
            sMatchedText = oMatch.SubMatches(0)
    
            '//DEBUG -- CONS.print "Match #" & ix & ": " & oMatch.Value & " At:" & cStr(oMatch.FirstIndex + 1)  & " Match:" & sMatchedText
    
            '// Replace the match in the output with the 
            '// required span tag, with unique match number as ID
            strOutput = Left(strOutput, oMatch.FirstIndex) & _
                        "<span id='MATCH#" & ix & "'>" & sMatchedText & "</span>" _
                        & Mid(strOutput, oMatch.FirstIndex + oMatch.length + 1)
        Next
    
        Set re = nothing
    
        '// Return the processed output
        HighlightText = strOutput
    
    End Function