Search code examples
excelvbaoutlookextract

Extracting Various Formats of Phone Numbers from Outlook Bounced E-mails


My co-workers have a bottleneck. Updating contact information in our CRM via bounced e-mails. They have a LOT of emails to wade through considering many are just "out of office" emails.

Here is the full code I have so far:

'Enable Microsoft Outlook 16.0 Object Library in Tools>>>References
Option Explicit
Sub GetFromOutlook()

Dim OutlookApp As Outlook.Application
Dim OutlookNamespace As Namespace
Dim Folder As MAPIFolder
Dim OutlookMail As Variant
Dim i As Integer
Dim RowCount As Integer

Set OutlookApp = New Outlook.Application
Set OutlookNamespace = OutlookApp.GetNamespace("MAPI")
Set Folder = OutlookNamespace.GetDefaultFolder(olFolderInbox).Folders("Contact Info")

Range("a1").Select
Range("a1").Value = "eMail_subject"
Range("b1").Value = "eMail_date"
Range("c1").Value = "eMail_sender"
Range("d1").Value = "eMail_text"

i = 1

For Each OutlookMail In Folder.Items
    'If OutlookMail.ReceivedTime >= Range("From_date").Value Then
        ActiveCell.Offset(i, 0).Value = OutlookMail.Subject
        ActiveCell.Offset(i, 1).Value = OutlookMail.ReceivedTime
        ActiveCell.Offset(i, 2).Value = OutlookMail.SenderName
        ActiveCell.Offset(i, 3).Value = OutlookMail.Body
        
        i = i + 1
    'End If
Next OutlookMail

Set Folder = Nothing
Set OutlookNamespace = Nothing
Set OutlookApp = Nothing

RowCount = WorksheetFunction.CountA(Range("a2:a1000000")) + 1


'Creating Cleaned Message Column--------------------
Range("e1").Value = "Cleaned Message"

Range("e2").Select

ActiveCell.FormulaR1C1 = _
    "=TRIM(SUBSTITUTE(SUBSTITUTE(RC[-1],CHAR(13),""""),CHAR(10),""""))"
    
Selection.AutoFill Destination:=Range(ActiveCell, Cells(RowCount, ActiveCell.Column))

Range("e1", Cells(RowCount, ActiveCell.Column)).Select

Selection.Copy

Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
    :=False, Transpose:=False

'Creating Message Status Column---------------------
Range("f1").Value = "Message Status"

Range("f2").Select

ActiveCell.Formula2R1C1 = _
    "=IFS(ISNUMBER(FIND(""retire"",lower(RC[-1]))),""Retired"",ISNUMBER(FIND(""no longer with"",lower(RC[-1]))),""No Longer With"",ISNUMBER(FIND(""no longer employed"",lower(RC[-1]))),""No Longer Employed"",ISNUMBER(FIND(""out of the office"",lower(RC[-1]))),""Out of the office"",ISNUMBER(FIND(""out of office"",lower(RC[-1]))),""Out of the Office"",ISNUMBER(FIND(""vacation"",lower(RC[-1]))),""On Vacation"",ISNUMBER(FIND(""out of the facility"",lower(RC[-1]))),""Out of the Office"",ISNUMBER(FIND(""unavailable"",lower(RC[-1]))),""Out of the Office""" _
    & ",ISNUMBER(FIND(""office will be close"",lower(RC[-1]))),""Office(s) Closed"",ISNUMBER(FIND(""office is closed"",lower(RC[-1]))),""Office(s) Closed"",ISNUMBER(FIND(""offices are closed"",lower(RC[-1]))),""Office(s) Closed"",ISNUMBER(FIND(""unable to respond"",lower(RC[-1]))),""Out of the Office"",ISNUMBER(FIND(""I will be out"",lower(RC[-1]))),""Out of the Office"",ISNUMBER(FIND(""away from my computer"",lower(RC[-1]))),""Away From Computer"",ISNUMBER(FIND(""away from computer"",lower(RC[-1]))),""Away From Computer"",ISNUMBER(FIND(""time off"",lower(RC[-1]))),""Vacation""" _
    & ",ISNUMBER(FIND(""time-off"",lower(RC[-1]))),""Vacation"",ISNUMBER(FIND(""deactivate"",lower(RC[-1]))),""Deactivated"",ISNUMBER(FIND(""closed for the holiday"",lower(RC[-1]))),""Office(s) Closed"",ISNUMBER(FIND(""working off-site"",lower(RC[-1]))),""Off-site"",ISNUMBER(FIND(""working off site"",lower(RC[-1]))),""Off-site"",ISNUMBER(FIND(""business trip"",lower(RC[-1]))),""Out of the Office"")"
    
Selection.AutoFill Destination:=Range(ActiveCell, Cells(RowCount, ActiveCell.Column))

Range("L1", Cells(RowCount, ActiveCell.Column)).Select

Selection.Copy

Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
    :=False, Transpose:=False

'Creating Phone Number Extract Column--------------
Range("G1").Select

ActiveCell.Value = "Phone Number 1"
ActiveCell.Offset(0, 1).Value = "Phone Number 2"
ActiveCell.Offset(0, 2).Value = "Phone Number 3"
ActiveCell.Offset(0, 3).Value = "Phone Number 4"
ActiveCell.Offset(0, 4).Value = "Phone Number 5"
ActiveCell.Offset(0, 5).Value = "Phone Number 6"

Call PhoneExtract
    
'Formatting all cells------------------------------
Range("a1", Cells(RowCount, 12)).Select

With Selection
    .HorizontalAlignment = xlLeft
    .VerticalAlignment = xlTop
    .WrapText = False
    .Orientation = 0
    .AddIndent = False
    .IndentLevel = 0
    .ShrinkToFit = False
    .ReadingOrder = xlContext
    .MergeCells = False
End With

Columns("A:A").Select
Selection.ColumnWidth = 25
Columns("B:B").EntireColumn.AutoFit
Columns("C:C").EntireColumn.AutoFit
Columns("D:D").Select
Selection.ColumnWidth = 25
Columns("E:F").Select
Selection.ColumnWidth = 80
Columns("G:L").Select
Selection.ColumnWidth = 25

Range("A1").Select

ActiveSheet.Range("A1").AutoFilter

Call Mail_workbook_Outlook_1

MsgBox "Macro has completed!"

End Sub
'---------------------------------------------------------------------------------------
'Option Explicit

Sub PhoneExtract()
    Dim str As String, n As Long, rw As Long
    Dim rgx As Object, cmat As Object, ws As Worksheet

    Set rgx = CreateObject("VBScript.RegExp")
    Set ws = Worksheets(ActiveSheet.Name)
    
    Range("G2").Select

    With rgx
        .Global = True
        .MultiLine = True
        'phone number pattern is: ###-###-####
        .Pattern = "[0-9,\-]{12}"
        For rw = 2 To ws.Cells(Rows.Count, "A").End(xlUp).Row
            str = ws.Cells(rw, "E").Value2
            If .Test(str) Then
                Set cmat = .Execute(str)
                'populate the worksheet with the matches
                For n = 0 To cmat.Count - 1
                    If Left(cmat.Item(n).Value, 2) = "1-" Then
                        ws.Cells(rw, Columns.Count).End(xlToLeft).Offset(0, 1) = ""
                    Else
                        ws.Cells(rw, Columns.Count).End(xlToLeft).Offset(0, 1) = cmat.Item(n)
                    End If
                Next n
            End If
        Next rw
    End With

    Set rgx = Nothing: Set ws = Nothing
    
    Set rgx = CreateObject("VBScript.RegExp")
    Set ws = Worksheets(ActiveSheet.Name)
    
    With rgx
        .Global = True
        .MultiLine = True
        'phone number pattern is: ###-###-####
        .Pattern = "[0-9,\-]{14}"
        For rw = 2 To ws.Cells(Rows.Count, "A").End(xlUp).Row
            str = ws.Cells(rw, "E").Value2
            If .Test(str) Then
                Set cmat = .Execute(str)
                'populate the worksheet with the matches
                For n = 0 To cmat.Count - 1
                    If Left(cmat.Item(n).Value, 2) = "1-" Then
                        ws.Cells(rw, Columns.Count).End(xlToLeft).Offset(0, 1) = Mid(cmat.Item(n).Value, 3, 500)
                    End If
                Next n
            End If
        Next rw
    End With

    Set rgx = Nothing: Set ws = Nothing
    
    Set rgx = CreateObject("VBScript.RegExp")
    Set ws = Worksheets(ActiveSheet.Name)
    
    With rgx
        .Global = True
        .MultiLine = True
        'phone number pattern is: ###-###-####
        .Pattern = "[0-9,\-]{8}"
        For rw = 2 To ws.Cells(Rows.Count, "A").End(xlUp).Row
            str = ws.Cells(rw, "E").Value2
            If .Test(str) Then
                Set cmat = .Execute(str)
                'populate the worksheet with the matches
                For n = 0 To cmat.Count - 1
                    If Len(cmat.Item(n).Value) < 9 And Mid(cmat.Item(n).Value, 8, 1) <> "-" Then
                        ws.Cells(rw, Columns.Count).End(xlToLeft).Offset(0, 1) = cmat.Item(n) 'Mid(cmat.Item(n).Value, 3, 500)
                    End If
                Next n
            End If
        Next rw
    End With

    Set rgx = Nothing: Set ws = Nothing

End Sub
'---------------------------------------------------------------------------------------------
Sub Mail_workbook_Outlook_1()
'Working in Excel 2000-2016
'This example send the last saved version of the Activeworkbook
'For Tips see: http://www.rondebruin.nl/win/winmail/Outlook/tips.htm
    Dim OutApp As Object
    Dim OutMail As Object

    Set OutApp = CreateObject("Outlook.Application")
    Set OutMail = OutApp.CreateItem(0)

    On Error Resume Next
    With OutMail
        .To = "<co-worker email>"
        .CC = "<co-worker email>"
        .BCC = ""
        .Subject = "Automating Contact Info Updates in Tdf " & Date
        .Body = "This is an automated message that is only sent to specified recipients when an Excel Macro is run for the purpose specified in the subject line."
        .Attachments.Add ActiveWorkbook.FullName
        'You can add other files also like this
        '.Attachments.Add ("C:\test.txt")
        .Send   'or use .Display
    End With
    On Error GoTo 0

    Set OutMail = Nothing
    Set OutApp = Nothing
End Sub

The only issue I have (so far) is that the phone numbers included in the body of the emails are all formatted differently. I have managed to account for country code and no country code as well as 12 digit (10 digits with 2 dashes) versus 8 digits (7 digits with one dash), but I am running into issues trying to account for 10 digits without any dashes (and probably 8 digits without dashes).

For example, in the body of the email I will have a number like 5853182096, but it will only give me the first 8 digits.

Also, some other random sequences of digits will be identified by the procedure as phone numbers. For example, one says "Electronic Communications Privacy Act, 18 U.S.C. Sections 2510-2521" and the phone number output is 2510-252. Is there a way to exclude those as well. I am assuming I will just have to do a string search with digit wildcards at the end and adjust the script as I see them. If there is an easier way then that would be great.

Regardless, let me know your thoughts. As always, any and all help is accepted and thank you for the support!


Solution

  • You can create regex patterns that better match your phone number variations. Your comments only indicate a single pattern (###-###-####) your three regexes will return many strings that do not match that pattern. To match that particular pattern, I'd suggest \b\d{3}-\d{3}-\d{4}\b but that might be too restrictive. You really need to look at the possible patterns more closely. Given the patterns in your code, in addition to the mismatches you mention, one of them would also match 1,,456---89147 clearly not a phone number.

    I don't know if the regex is your only problem. Also, I don't understand (at least with North American phone numbers, your possible pattern of 8 digits. I could understand 7 digits. For North American phone numbers, not taking into account international numbers, the following regex will match, including that 10 digit string; and won't match the USC citation:

    \b(?:[2-9]\d{2}-?)?\d{3}-?\d{4}\b 
    

    (North American phone numbers cannot start with a [01]).

    Other countries have different patterns.

    Here is an explanation of the regex, with links to details on the items:

    North American style phone numbers

    \b(?:[2-9]\d{2}-?)?\d{3}-?\d{4}\b
    

    Options: Case sensitive; ^$ don’t match at line breaks

    Created with RegexBuddy