I need to extract email id from each row of specific domain extensions like .com .net .org everything else should be ignored. Below is the sample data of two rows.
.@.3,.@.1601466914865855,.@.,.@.null,.@.,abc@xyz.com,abc@xyz.net,abc@xyz.org,null.val@.@.,.@@,abc@xyz.jpb,abc@xyz.xls,abc@xyz.321
.@.3,.@.1601466914865855,.@.,.@.null,.@.,123@hjk.com,123@hjk.net,123@hjk.org,null.val@.@.,.@@,abc@xyz.jpb,abc@xyz.xls,abc@xyz.321
Whatever the first valid extension email matches is enough even though there are multiple id's only one email id is enough per row. Below is the sample desired result.
I believe this can be done with custom formula with regex but I can't wrap my head around it. I am using Desktop MS Excel latest version.
If your email addresses are relatively simple, you can use this regex:
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b
In VBA:
Option Explicit
Function extrEmail(S As String) As String
Dim RE As Object, MC As Object
Const sPat As String = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b"
Set RE = CreateObject("vbscript.regexp")
With RE
.Pattern = sPat
.ignorecase = True
.Global = False
.MultiLine = True
If .test(S) = True Then
Set MC = .Execute(S)
extrEmail = MC(0)
End If
End With
End Function
Matching an email address can become very complicated, and a regex that follows all the rules is extraordinarily complex and long. But this one is relatively simple, and might work for your needs.
Explanation of Regex
Emailaddress1
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b
Options: Case insensitive; ^$ match at line breaks
\b
[A-Z0-9._%+-]+
@
[A-Z0-9.-]+
\.
[A-Z]{2,}
\b
Created with RegexBuddy
EDIT: To match only specific domains, merely replace the part of the regex that matches domains with a group of pipe-separated domain names.
eg
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:com|net|org)\b