Search code examples
regexvbams-wordfindacronym

Find words with more than one capital letter in word/VBA


I have a piece of VBA code that uses Find to find all the acronyms in a document. It does this by searching for all words consisting of capital letters that are 2 or more characters long using...

<[A-Z]{2,}>

The problem with this is it doesn't pick up all the acronyms, such as CoP, W3C, DVDs and CD-ROM. It picks up hyphenated acronyms in two parts which are not ideal but allowable as the list is checked by a user. I can also pick up words that end with an "s" or other characters by not searching to the end of the word using...

<[A-Z]{2,}

But this doesn't count any non-upper case character as part of the word it finds.

Is there an expression that would allow me to search for words with two or more capital letters in any location and find the whole word?


Solution

  • I don't think it's possible to 'search for words with two or more capital letters in any location and find the whole word' except in combination with macro code. Since you're using a macro, anyway, here's an approach that worked for me using the following sample text

    CoP, this That and AnoTher thing W3C, DVDs and CD-ROM
    

    and this wildcard combination (note that the list separator in my Windows configuration is ;, for other regions a , may be required).

    <[A-Z][0-9A-Z\-a-z]{1;10}>
    

    The following function checks whether the second or any later letter in the "found" range is capitalized and returns a boolean to the calling procedure. It loops through the characters in the given Range, checking the ASCII value. As soon as one is found, the loop exits.

    Function ContainsMoreThanOneUpperCase(rng As Word.Range) As Boolean
        Dim nrChars As Long, i As Long
        Dim char As String
        Dim HasUpperCase
    
        HasUpperCase = False
        nrChars = rng.Characters.Count
        For i = 2 To nrChars
            char = rng.Characters(i).text
            If Asc(char) >= 65 And Asc(char) <= 90 Then
                'It's an uppercase letter
                HasUpperCase = True
                Exit For
            End If
        Next
        ContainsMoreThanOneUpperCase = HasUpperCase
    End Function
    

    An example for using it:

    Sub FindAcronyms()
        Dim rngFind As Word.Range
        Dim bFound As Boolean
    
        Set rngFind = ActiveDocument.content
        With rngFind.Find
            .text = "<[A-Z][0-9A-Z\-a-z]{1;10}>"
            .MatchWildcards = True
            .Forward = True
            .wrap = wdFindStop
            bFound = .Execute
            Do While bFound
                If bFound And ContainsMoreThanOneUpperCase(rngFind) Then
                    Debug.Print rngFind.text
                    rngFind.HighlightColorIndex = wdBrightGreen
                End If
                rngFind.Collapse wdCollapseEnd
                bFound = .Execute
            Loop
        End With
    End Sub