Tags: c#, email, gmail-api, mimekit

Strange characters turning up in generated email messages


I am using the Gmail API to send emails from within various apps, and MimeKit to generate the message content. Whenever I generate a message and send it, several instances of the 'Â' character appear in the output. They all appear in the signature block of the email, in front of the phone numbers.

I wrote a sample application, a .NET Core console app, which demonstrates the problem:

using Google.Apis.Auth.OAuth2;
using Google.Apis.Gmail.v1;
using Google.Apis.Gmail.v1.Data;
using Google.Apis.Services;
using Google.Apis.Util.Store;
using MimeKit;
using System.Text;


string[] Scopes = {"https://www.googleapis.com/auth/userinfo.email",
        "https://www.googleapis.com/auth/gmail.send",
        "https://www.googleapis.com/auth/gmail.settings.basic"};
string ApplicationName = "EMailIntegration";
string mySignature = string.Empty;
string myEMail = string.Empty;
string template = "<!DOCTYPE html><html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" /><style>body { font-family: Arial, sans-serif; color: #231F20;}</style></head><body>%body%<br>%signature%</body></html>";

UserCredential credential = null;

var success = await DoAuthenticationAsync();

if (success == false)
{
    Console.WriteLine("Authentication failed!");
    Console.ReadLine();
    return;
}
else
{
    Console.WriteLine($"Authenticated as {myEMail}");
}

//call your gmail service
var service = new GmailService(new BaseClientService.Initializer() { HttpClientInitializer = credential, ApplicationName = ApplicationName });

//create the Google message object
var msg = new Google.Apis.Gmail.v1.Data.Message();

string[] sendTo = { "notmyemail@gmail.com" };   

MimeKit.MimeMessage mimeMsg = new MimeKit.MimeMessage();

//add the message destinations
foreach (string dest in sendTo)
{
    mimeMsg.To.Add(MimeKit.MailboxAddress.Parse(dest));
}

var msgBlob = template.Replace("%body%", "<p>This is a test to see if we can eliminate issues with strange characters appearing in signature blocks.</p>");
msgBlob = msgBlob.Replace("%signature%", mySignature);

mimeMsg.Subject = "Email Integration Testing";

mimeMsg.Body = new TextPart(MimeKit.Text.TextFormat.Html)
{
    Text = msgBlob
};

msg.Raw = Base64UrlEncode(mimeMsg.ToString());

//send the message
service.Users.Messages.Send(msg, "me").Execute();

Console.WriteLine($"Sent!");

Console.ReadLine();

string Base64UrlEncode(string input)
{
    var data = Encoding.UTF8.GetBytes(input);
    return Convert.ToBase64String(data)
        .Replace("+", "-")
        .Replace("/", "_")
        .Replace("=", "");
}

//The following code is used to authenticate with Google API, obtain the users email address and signature block.
async Task<bool> DoAuthenticationAsync()
{
    ClientSecrets secrets;

    // this obtains the application secrets data we need to indicate our application is asking to authenticate.
    using (var strm = new FileStream("credentials.json", FileMode.Open, FileAccess.Read))
    {
        secrets = GoogleClientSecrets.FromStream(strm).Secrets;
    }

    try
    {
        // this is the magic black box that does the authenticating.
        credential = await GoogleWebAuthorizationBroker.AuthorizeAsync(secrets, Scopes, "user", CancellationToken.None, new FileDataStore("Google.API.Auth", false));

        var init = new BaseClientService.Initializer();
        init.HttpClientInitializer = credential;

        var svc = new GmailService(init);

        // this grabs the list of all e-mail aliases for the signed-in user and selects the primary
        ListSendAsResponse result = svc.Users.Settings.SendAs.List("me").Execute();

        foreach (SendAs itm in result.SendAs)
        {
            if (itm.IsPrimary.HasValue)
            {
                if (itm.IsPrimary == true)
                {
                    // save as the signature blob to use.
                    mySignature = itm.Signature;
                    myEMail = itm.SendAsEmail;
                    break;
                }
            }
        }
    }
    catch (Exception)
    {
        return false;
    }

    return true;
}

The signature block that is pulled from Google looks "similar" to this one:

<div dir="ltr">
    <table style="color:rgb(35,31,32);font-family:Arial,sans-serif;font-size:medium;width:554px">
        <tbody>
            <tr>
                <td width="275px">
                    <div style="font-size:14pt;font-weight:bold">Shane Brodie</div>
                    <div style="font-family:Aral,sans-serif;font-size:10pt;margin-bottom:10px">Senior Programmer/Analyst</div>
                    <div style="font-size:10pt;margin-bottom:4px"><strong>Toll Free:</strong> 204-555-1234</div>
                    <div style="font-size:10pt;margin-bottom:4px"><strong>Office:</strong> 204-555-1234</div>
                    <div style="font-size:10pt;margin-bottom:4px"><b>Extension:</b> 123</div>
                    <div style="font-size:10pt;margin-bottom:4px"><b>Cell Phone:</b> 204-555-1234</div>
                    <div style="font-size:10pt;margin-bottom:4px"><a href="mailto:notmyemail@gmail.com" target="_blank">notmymail@gmail.com</a></div>
                </td>
                <td style="width:4px;border-right:4px solid rgb(65,65,66);height:142px">
                </td>
                <td width="275px">
                    <div style="padding-bottom:5px"><a href="https://maverickind.ca/" target="_blank"><img src="http://maverickind.ca/wp-content/uploads/2023/11/maverick.png" width="200px" style="margin-bottom:10px;font-style:italic"></a></div>
                    <div style="display:table">
                        <div style="display:table-row">
                            <div style="display:table-cell"><a href="http://www.intersteel.ca/" title="Intersteel" target="_blank"><img src="http://maverickind.ca/wp-content/uploads/2023/12/INT-Sig-e1707227175846.png" style="width:40.5px;height:40.5px"></a></div>
                            <div style="display:table-cell"><a href="https://defenderpumpguard.com/" title="Defender" target="_blank"><img src="http://maverickind.ca/wp-content/uploads/2023/12/Def-Fist-Sig-e1707227150848.png" style="width:40.5px;height:40.5px"></a></div>
                            <div style="display:table-cell"><a href="http://www.maverickind.ca/" title="Maverick Industries Ltd." target="_blank"><img src="http://maverickind.ca/wp-content/uploads/2023/12/Mav-Horns-SIG-e1707227104191.png" style="width:40.5px;height:40.5px"></a></div>
                        </div>
                    </div>
                </td>
            </tr>
        </tbody>
    </table>
</div>

I have sanitized the data slightly and formatted it for easier viewing. Any ideas would be greatly appreciated. I've been fighting through this issue for weeks.


Solution

  • Don't use MimeMessage.ToString() to serialize the message.

    Use MimeMessage.WriteTo() instead.

    using (var memory = new MemoryStream ()) {
        var format = FormatOptions.Default.Clone ();
        format.NewLineFormat = NewLineFormat.Dos;
    
        mimeMsg.WriteTo (format, memory);
    
        var msgData = memory.ToArray ();
    
        msg.Raw = Base64UrlEncode (msgData);
    }
    
    string Base64UrlEncode(byte[] data)
    {
        return Convert.ToBase64String(data)
            .Replace("+", "-")
            .Replace("/", "_")
            .Replace("=", "");
    }
    

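    The issue is that MimeMessage.ToString() internally uses MimeMessage.WriteTo() but then has to convert the result to a string. Since messages can contain binary data or text in any charset, it uses iso-8859-1 to convert the raw byte[] to a string, so every byte becomes one character. If your message body contains non-ASCII characters, each byte of their UTF-8 encoding turns into a separate Latin-1 character, and re-encoding that string as UTF-8 (which is what Base64UrlEncode(string) does) ends up producing those 'Â' characters.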
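
The effect described above can be reproduced with a minimal sketch that uses only System.Text, with no MimeKit or Gmail calls. It assumes the offending character is a non-breaking space (U+00A0), which is common in HTML signatures but is not confirmed by the question. The sample string and variable names are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical sample: 'a', a non-breaking space (U+00A0), 'b'.
// Assumption: the signature contains U+00A0 in front of the phone numbers.
string text = "a\u00A0b";

// The bytes a UTF-8 text part contains when written to a stream.
byte[] wireBytes = Encoding.UTF8.GetBytes(text);

// Decoding those raw bytes as iso-8859-1 maps every byte to one char,
// which is the conversion the answer attributes to ToString().
string viaToString = Encoding.GetEncoding("iso-8859-1").GetString(wireBytes);

// The question's Base64UrlEncode(string) re-encodes that string as UTF-8,
// so the bytes C2 and A0 are each encoded again as two-byte sequences.
byte[] doubleEncoded = Encoding.UTF8.GetBytes(viaToString);

// A recipient that trusts the charset=utf-8 header decodes this text.
string received = Encoding.UTF8.GetString(doubleEncoded);

Console.WriteLine(BitConverter.ToString(wireBytes));      // 61-C2-A0-62
Console.WriteLine(BitConverter.ToString(doubleEncoded));  // 61-C3-82-C2-A0-62
Console.WriteLine(received.Contains('\u00C2'));           // True: a stray 'Â' appeared
```

Passing the byte[] from WriteTo() straight to the Base64 step skips the string round trip entirely, so the recipient gets the original C2 A0 sequence and no extra character.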