c# .net parsing email-address

Best way to parse string of email addresses


I am working with some email header data, and in the To:, From:, Cc:, and Bcc: fields the email address(es) can be expressed in a number of different ways:

First Last <[email protected]>
Last, First <[email protected]>
[email protected]

These variations can appear in the same message, in any order, all in one comma-separated string:

Last, First <[email protected]>, [email protected], First Last <[email protected]>

I've been trying to come up with a way to parse this string into a separate First Name, Last Name, and E-Mail for each person (omitting the name if only an email address is provided).

Can someone suggest the best way to do this?

I've tried to Split on the commas, which works except for the second format, where the last name comes first and the name itself contains a comma. I suppose this could still work if, after the split, I examine each element for an '@' or '<'/'>'; if it has none, it can be assumed to be a last name, with the next element holding the first name. Is this a good way to approach this? Have I overlooked another format the address could be in?
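
The split-and-merge idea described above could be sketched roughly as follows. This is only an illustration: the names SplitAndMerge and SplitAddresses and the example.com addresses are made up, and it assumes every real address contains an '@' while an unquoted name fragment does not.

```csharp
using System;
using System.Collections.Generic;

public static class SplitAndMerge
{
    // Splits on every comma, then glues a piece that has no '@' onto the
    // piece that follows it, on the assumption that it is the "Last" half
    // of a "Last, First <address>" entry.
    public static List<string> SplitAddresses(string header)
    {
        var result = new List<string>();
        string pending = null; // name fragment still waiting for its address

        foreach (string piece in header.Split(','))
        {
            string current = pending == null ? piece : pending + "," + piece;

            if (current.Contains("@"))
            {
                result.Add(current.Trim());
                pending = null;
            }
            else
            {
                pending = current;
            }
        }

        // A trailing fragment with no '@' is kept rather than silently dropped.
        if (!string.IsNullOrWhiteSpace(pending))
            result.Add(pending.Trim());

        return result;
    }
}
```

Called with "Last, First <a@example.com>, b@example.com, First Last <c@example.com>", this returns three entries, the first being "Last, First <a@example.com>". A quoted display name that contains an '@' or a comma would still confuse it, so that is one more format to watch for.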


UPDATE: Perhaps I should clarify a little. All I am looking to do is break up the string containing the multiple addresses into individual strings, each containing one address in whatever format it was sent in. I have my own methods for validating and extracting the information from an address; the tricky part was figuring out the best way to separate each address.

Here is the solution I came up with to accomplish this:

using System.Collections.Generic;

string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>";

List<string> addresses = new List<string>();
int start = 0;        // index where the current address begins
bool seenAt = false;  // true once the current address has contained an '@'

for (int c = 0; c < str.Length; c++)
{
    if (str[c] == '@')
        seenAt = true;

    // A comma ends an address only after that address's '@' has been seen;
    // an earlier comma belongs to a "Last, First" name.
    if (str[c] == ',' && seenAt)
    {
        addresses.Add(str.Substring(start, c - start).Trim());
        start = c + 1;    // skip the separating comma
        seenAt = false;   // the next address needs its own '@'
    }
}

// Whatever follows the last separator is the final address. This also
// covers input that holds a single address and no separating comma.
string last = str.Substring(start).Trim();
if (last.Length > 0)
    addresses.Add(last);

The above code generates the individual addresses that I can process further down the line.
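
For that later processing step, a rough sketch of pulling the name and email out of one separated entry might look like this. It is not my actual extraction code; AddressParts and Parse are hypothetical names, and it assumes the name is written either as "Last, First" or as "First Last", with everything after the first space treated as the last name.

```csharp
using System;

public static class AddressParts
{
    // Returns (First, Last, Email); both names are empty for a bare address.
    public static (string First, string Last, string Email) Parse(string entry)
    {
        // Drop whitespace and any stray separator left over from splitting.
        entry = entry.Trim(' ', ',');

        int open = entry.LastIndexOf('<');
        int close = entry.LastIndexOf('>');

        // No angle brackets: the whole entry is taken to be the address.
        if (open < 0 || close < open)
            return ("", "", entry);

        string email = entry.Substring(open + 1, close - open - 1).Trim();
        string name = entry.Substring(0, open).Trim().Trim('"').Trim();

        int comma = name.IndexOf(',');
        if (comma >= 0)
        {
            // "Last, First": the part before the comma is the last name.
            return (name.Substring(comma + 1).Trim(),
                    name.Substring(0, comma).Trim(),
                    email);
        }

        // "First Last": split at the first space, if there is one.
        int space = name.IndexOf(' ');
        if (space < 0)
            return (name, "", email);

        return (name.Substring(0, space), name.Substring(space + 1).Trim(), email);
    }
}
```

For example, Parse("Last, First <a@example.com>") gives First = "First", Last = "Last", Email = "a@example.com", while Parse("b@example.com") gives two empty names and the address unchanged.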


Solution

  • Here is the solution I came up with to accomplish this:

    using System.Collections.Generic;

    string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>";

    List<string> addresses = new List<string>();
    int start = 0;        // index where the current address begins
    bool seenAt = false;  // true once the current address has contained an '@'

    for (int c = 0; c < str.Length; c++)
    {
        if (str[c] == '@')
            seenAt = true;

        // A comma ends an address only after that address's '@' has been seen;
        // an earlier comma belongs to a "Last, First" name.
        if (str[c] == ',' && seenAt)
        {
            addresses.Add(str.Substring(start, c - start).Trim());
            start = c + 1;    // skip the separating comma
            seenAt = false;   // the next address needs its own '@'
        }
    }

    // Whatever follows the last separator is the final address. This also
    // covers input that holds a single address and no separating comma.
    string last = str.Substring(start).Trim();
    if (last.Length > 0)
        addresses.Add(last);
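
    One format the loop above does not cover is a quoted display name that itself contains a comma or an '@', for example "Doe, Jane @ Work" in front of the bracketed address. A possible variant, sketched here as a reusable method, ignores anything between double quotes while scanning. AddressSplitter and Split are hypothetical names, the example.com addresses are made up, and escaped quotes inside a quoted name are not handled.

```csharp
using System;
using System.Collections.Generic;

public static class AddressSplitter
{
    public static List<string> Split(string header)
    {
        var addresses = new List<string>();
        int start = 0;          // index where the current address begins
        bool inQuotes = false;  // true while inside a "quoted display name"
        bool seenAt = false;    // true once the current address has had an '@'

        for (int i = 0; i < header.Length; i++)
        {
            char ch = header[i];

            if (ch == '"')
                inQuotes = !inQuotes;
            else if (inQuotes)
                continue;       // commas and '@' inside quotes are plain text
            else if (ch == '@')
                seenAt = true;
            else if (ch == ',' && seenAt)
            {
                AddIfNotEmpty(addresses, header.Substring(start, i - start));
                start = i + 1;
                seenAt = false;
            }
        }

        // The remainder after the last separator is the final address.
        AddIfNotEmpty(addresses, header.Substring(start));
        return addresses;
    }

    private static void AddIfNotEmpty(List<string> list, string candidate)
    {
        candidate = candidate.Trim();
        if (candidate.Length > 0)
            list.Add(candidate);
    }
}
```

    Given "\"Doe, Jane @ Work\" <jane@example.com>, bob@example.com, Smith, Al <al@example.com>", this returns three entries and keeps the quoted name in one piece.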