Search code examples
regexvbaemail-bounces

VBA RegEx scraping bounced emails in MS Outlook


I'm trying to expand the functionality of some Outlook email scrapping VBA code. I get bounce-back emails on a regular basis and would like to keep track of these (for deletion) by exporting said email address to MS Excel.

The code works, to a point. I can only scrape the first email address within a typical bounce-back notification email using RegEx. The mail servers for the company that I work for aggregate emails from the same domain into one notification email. Consequently, I get multiple notification emails that contain multiple bounced emails.

How do I get RegEx to cycle through the entire notification email to gather all email addresses??? I'm a little stuck right now because — admittedly — I don't know much about RegEx and "adopted" the majority of this code...

Thank you for your help Stackoverflow!!!

Sub Extract_Invalid_To_Excel()

Dim olApp As Outlook.Application
Dim olExp As Outlook.Explorer
Dim olFolder As Outlook.MAPIFolder
Dim obj As Object
Dim stremBody As String
Dim stremSubject As String
Dim i As Long
Dim x As Long
Dim count As Long
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
Dim xlApp As Object 'Excel.Application
Dim xlwkbk As Object 'Excel.Workbook
Dim xlwksht As Object 'Excel.Worksheet
Dim xlRng As Object 'Excel.Range

Set olApp = Outlook.Application
Set olExp = olApp.ActiveExplorer

Set olFolder = olExp.CurrentFolder

'Open Excel
Set xlApp = GetExcelApp
xlApp.Visible = True
If xlApp Is Nothing Then GoTo ExitProc

Set xlwkbk = xlApp.Workbooks.Add
Set xlwksht = xlwkbk.Sheets(1)
Set xlRng = xlwksht.Range("A1")
xlRng.Value = "Bounced email addresses"

'Set count of email objects
count = olFolder.Items.count

'counter for excel sheet
i = 0
'counter for emails
x = 1

For Each obj In olFolder.Items '**Loops through selected Outlook folder**
    xlApp.StatusBar = x & " of " & count & " emails completed"
    stremBody = obj.Body
    stremSubject = obj.Subject

    If checkEmail(stremBody) = True Then '**Checks email for keywords in email
        'MsgBox ("finding email: " & stremBody)

        '**RegEx to find email addresses within message body
        With RegEx
            .Pattern = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"
            .IgnoreCase = True
            .MultiLine = True
            .Global = False
            Set olMatches = .Execute(stremBody) 'Executes RegEx function

            'Loop through RegEx matches
            For Each match In olMatches
                xlwksht.Cells(i + 2, 1).Value = match
                i = i + 1
            Next match
        End With
        'TODO: move or mark the email that had the address extracted
    Else
        '**To view the items that aren't being parsed uncomment the following line
        'MsgBox (stremBody)
    End If

    x = x + 1
Next obj
xlApp.ScreenUpdating = True
MsgBox ("Invalid Email addresses are done being extracted")

ExitProc:
Set xlRng = Nothing
Set xlwksht = Nothing
Set xlwkbk = Nothing
Set xlApp = Nothing
Set emItm = Nothing
Set olFolder = Nothing
Set olNS = Nothing
Set olApp = Nothing
End Sub

Function GetExcelApp() As Object
' always create new instance
On Error Resume Next
Set GetExcelApp = CreateObject("Excel.Application")
On Error GoTo 0
End Function

Function checkEmail(ByVal Body As String) As Boolean
    Dim keywords(3) As String
    keywords(0) = "recipient's e-mail address was not found"
    keywords(1) = "error occurred while trying to deliver this message"
    keywords(2) = "message wasn't delivered"

    'Default value
    checkEmail = False
     For Each word In keywords
        If InStr(1, Body, word, vbTextCompare) > 1 Then
            checkEmail = True
            Exit For
        End If
     Next word
End Function

To provide more detail. I would receive hundreds of emails which contain the following text:

Delivery has failed to these recipients or distribution lists:

John.Doe@abc.com
The recipient's e-mail address was not found in the recipient's e-mail system. Microsoft Exchange will not try to redeliver this message for you. Please check the e-mail address and try resending this message, or provide the following diagnostic text to your system administrator.

Morgan.Freedman@abc.com
The recipient's e-mail address was not found in the recipient's e-mail system. Microsoft Exchange will not try to redeliver this message for you. Please check the e-mail address and try resending this message, or provide the following diagnostic text to your system administrator.

Michael.Jordan@abc.com
The recipient's e-mail address was not found in the recipient's e-mail system. Microsoft Exchange will not try to redeliver this message for you. Please check the e-mail address and try resending this message, or provide the following diagnostic text to your system administrator.

The code above is able to grab the first email address in the email body text (i.e. John.Doe@abc.com), but doesn't see the other two email addresses...

The rest of the code works flawlessly. It exports the email addresses that it does find into Excel.


Solution

  • While being still new to the RegEx function, I blindly altered the code slightly.

    I changed the RegEx.Global boolean to True and this code will work flawlessly.

    With RegEx
       yadda yadda yadda
       .Global = True
    End With
    

    Well - thanks in any case. Hope this helps other people!!!