I am currently writing a C# program that parses a SQL Server query report for data that it then saves for later use. So far, the queries I've been parsing have been very straightforward, but I just received a test file that contains this entry:
Client Contact Client Email
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Primary: Nelli Quiroga
, nelli.quiroga@parservicios.com
Secondary: Zully Aranda, zully.aranda@parservicios.com
nelli.quiroga@parservicios.com|zully.aranda@parservicios.com
(Please excuse the formatting - that's exactly how the query outputs it.)
I need to be able to pull the names and email addresses of both the Primary and Secondary clients, which will then be stored in a List<Tuple<string, string>>
. So far, I haven't been able to devise a solution that does it efficiently, if at all. My best guess is that I would have to use Regex somehow, but I'm not familiar with how Regex works in C#, and my regex skills are very rusty.
Some additional information:
This entry(ies) will be part of a query report that contains other table queries, which thankfully are all formatted like so:
User_Name Group_Name
-------------------------------------------------- --------------------------------------------------
msteffl NULL
spurcell Admin
wgervais Admin
djames Admin
loaduser Admin
mbreu Admin
wgervais ComplianceContractors
wgervais ComplianceOfficers
sgregory ComplianceOfficers
ntabares ComplianceOfficers
lduffaut ComplianceOfficers
pdeneree ComplianceOfficers
serickson ComplianceOfficers
mbreu ComplianceOfficers
jreinhardt ComplianceOfficers
jromoser ComplianceOfficers
ebell ComplianceOfficers
bkeogh ComplianceOfficers
cbarnett ComplianceOfficers
cbarnett Users
bkeogh Users
ebell Users
djames Users
jromoser Users
jreinhardt Users
mbreu Users
serickson Users
pdeneree Users
lduffaut Users
ntabares Users
sgregory Users
spurcell Users
wgervais Users
Any and all input would be appreciated.
Alright, after some trial and error, I was able to write a regular expression that works just fine with the contact output I was given. I cannot guarantee that it will work well with any output file I can/will be given, but for now it does what I need it to do.
This is the regular expression in question:
(?:Primary|Secondary):[ ]*[\r\n]*[a-zA-Z ]+[\r\n]*[ ]*,[ ]*[\r\n]*[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-z]+[ ]*[\r\n]*
It is able to break up the entry into two parts, the Primary and Secondary contact, which then allows me to trim and split each contact as needed.
Again, I cannot say for certain that this will work with any given input, but I also think I've managed to make it reasonably foolproof, hence all the instances of [ ]*
and [\r\n]*
.