Bytes read as UTF8 string and converted to Base64


Forgive the lengthy setup here but I thought it may help to have the context...

I am implementing a custom digital signature validation method as part of a WCF service. We're using a custom method because of various differing interpretations of some industry standards, but the details there aren't all that relevant.

In this particular scenario, I am receiving an MTOM/XOP encoded request where the root MIME part contains a digital signature and the signature DigestValue and SignatureValue pieces are split up into separate MIME parts.

The MIME parts that contain the signature DigestValue and SignatureValue data are binary encoded, so each one is literally a bunch of raw bytes in the web request like this:

Content-Id: <[email protected]>
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary

[non-printable-binary-data-goes-here]
--uuid:eda4d7f2-4647-4632-8ecb-5ba44f1a076d

I am reading the contents of the message in as a string (using the default UTF8 encoding) like this (see the requestAsString parameter below):

MessageBuffer buffer = request.CreateBufferedCopy(int.MaxValue);
try
{
    using (MemoryStream mstream = new MemoryStream())
    {
        buffer.WriteMessage(mstream);
        mstream.Position = 0;

        using (StreamReader sr = new StreamReader(mstream))
        {
            requestAsString = sr.ReadToEnd();
        }

        request = buffer.CreateMessage();
    }
}
finally
{
    buffer.Close();
}

After I read the MTOM/XOP message in, I am attempting to re-organize the multiple MIME parts into one SOAP message where the signature DigestValue and SignatureValue elements are restored to the original SOAP envelope (and not as attachments). So basically I am decoding the MTOM/XOP request.

Unfortunately, I am having trouble reading the DigestValue and SignatureValue pieces correctly. I need to read the bytes out of the message and get the base64 string representation of that data.
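For reference, a minimal sketch of getting that base64 text without ever building a string from the binary data. It assumes the raw bytes of a single MIME part are available (for example a slice of mstream.ToArray()), that lines end in CRLF, and that the boundary is known from the Content-Type header. MimePartBytes, IndexOf and BodyAsBase64 are hypothetical names, not part of WCF.

```csharp
using System;
using System.Text;

public static class MimePartBytes
{
    // Returns the index of the first occurrence of pattern at or after start, or -1.
    public static int IndexOf(byte[] data, byte[] pattern, int start)
    {
        for (int i = start; i <= data.Length - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && data[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }

    // Hypothetical helper: base64-encodes the bytes between the blank line that
    // ends the part headers and the CRLF that precedes the next boundary.
    public static string BodyAsBase64(byte[] raw, string boundary)
    {
        byte[] headerEnd = Encoding.ASCII.GetBytes("\r\n\r\n");
        byte[] delimiter = Encoding.ASCII.GetBytes("\r\n--" + boundary);

        int bodyStart = IndexOf(raw, headerEnd, 0);
        if (bodyStart < 0) throw new FormatException("No blank line after the part headers.");
        bodyStart += headerEnd.Length;

        int bodyEnd = IndexOf(raw, delimiter, bodyStart);
        if (bodyEnd < 0) throw new FormatException("Closing boundary not found.");

        // The body never passes through a text decoder, so no byte is altered.
        return Convert.ToBase64String(raw, bodyStart, bodyEnd - bodyStart);
    }

    public static void Main()
    {
        // Illustrative part: ASCII headers, a blank line, 20 raw bytes, then the boundary.
        string boundary = "uuid:eda4d7f2-4647-4632-8ecb-5ba44f1a076d";
        byte[] body = Convert.FromBase64String("mowXMw68eLSv9J1W7f43MvNgCrc=");
        byte[] head = Encoding.ASCII.GetBytes(
            "Content-Type: application/octet-stream\r\nContent-Transfer-Encoding: binary\r\n\r\n");
        byte[] tail = Encoding.ASCII.GetBytes("\r\n--" + boundary);

        byte[] raw = new byte[head.Length + body.Length + tail.Length];
        head.CopyTo(raw, 0);
        body.CopyTo(raw, head.Length);
        tail.CopyTo(raw, head.Length + body.Length);

        Console.WriteLine(BodyAsBase64(raw, boundary));
    }
}
```

This is only a sketch: it does not handle an empty body or a payload that happens to contain the boundary marker, which a real MIME parser has to consider.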

Despite all the context above, it seems the core problem is reading the binary data in as a string (UTF8 encoded) and then converting it to a proper base64 representation.

Here is what I am seeing in my test code:

This is my example base64 string:

string base64String = "mowXMw68eLSv9J1W7f43MvNgCrc=";

I can then get the byte representation of that string. This yields an array of 20 bytes:

byte[] base64Bytes = Convert.FromBase64String(base64String);

I then decode those bytes into a string using UTF8:

string decodedString = UTF8Encoding.UTF8.GetString(base64Bytes);

Now the strange part... if I convert the string back to bytes as follows, I get an array of bytes that is 39 bytes long:

byte[] base64BytesBack = UTF8Encoding.UTF8.GetBytes(decodedString);

So obviously at this point, when I convert back into a base64 string, it doesn't match the original value:

string base64StringBack = Convert.ToBase64String(base64BytesBack);

base64StringBack is set to "77+977+9FzMO77+9eO+/ve+/ve+/vVbvv73vv703Mu+/vWAK77+9"

What am I doing wrong here? If I switch to using UTF8Encoding.Unicode.GetString() and UTF8Encoding.Unicode.GetBytes(), it works as expected:

string base64String = "mowXMw68eLSv9J1W7f43MvNgCrc=";

// First get an array of bytes from the base64 string
byte[] base64Bytes = Convert.FromBase64String(base64String);

// Get the Unicode representation of the base64 bytes.
string decodedString = UTF8Encoding.Unicode.GetString(base64Bytes);

byte[] base64BytesBack = UTF8Encoding.Unicode.GetBytes(decodedString);

string base64StringBack = Convert.ToBase64String(base64BytesBack);

Now base64StringBack is set to "mowXMw68eLSv9J1W7f43MvNgCrc=" so it seems I am misusing the UTF8 encoding somehow or it is behaving differently than I would expect.
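A minimal sketch of what appears to be going on, using only the sample value above. Arbitrary binary data is generally not valid UTF-8, so the decoder substitutes U+FFFD for the sequences it cannot decode and the original bytes cannot be recovered. The UTF-16 (Encoding.Unicode) round trip succeeds for this particular value but is not safe in general. RoundTripDemo and RoundTrip are illustrative names.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Decodes data to a string with the given encoding, then encodes it back.
    public static byte[] RoundTrip(Encoding encoding, byte[] data)
    {
        return encoding.GetBytes(encoding.GetString(data));
    }

    public static void Main()
    {
        byte[] digest = Convert.FromBase64String("mowXMw68eLSv9J1W7f43MvNgCrc=");

        // The first digest byte, 0x9A, is a UTF-8 continuation byte with no lead
        // byte before it, so it is replaced by U+FFFD and its value is lost.
        byte[] viaUtf8 = RoundTrip(Encoding.UTF8, digest);
        Console.WriteLine("UTF-8 round trip intact:  " + digest.SequenceEqual(viaUtf8));

        // UTF-16 survives here only because 20 is an even byte count and none of
        // the 16-bit units falls in the surrogate range. Neither is guaranteed.
        byte[] viaUtf16 = RoundTrip(Encoding.Unicode, digest);
        Console.WriteLine("UTF-16 round trip intact: " + digest.SequenceEqual(viaUtf16));

        // An odd-length payload cannot survive, since UTF-16 output is always even.
        byte[] odd = { 0x9A, 0x8C, 0x17 };
        byte[] oddBack = RoundTrip(Encoding.Unicode, odd);
        Console.WriteLine("UTF-16, odd length intact: " + odd.SequenceEqual(oddBack));

        // Base64-encoding the bytes directly never involves a text decoder.
        Console.WriteLine(Convert.ToBase64String(digest));
    }
}
```

The takeaway under these assumptions: keep binary MIME parts as byte[] from the moment they are read, and only call Convert.ToBase64String on those bytes.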


Solution

  • Ok, I took a different approach to reading the MTOM/XOP message:

    Instead of relying on my own code to parse the MIME parts by hand, I just used XmlDictionaryReader.CreateMtomReader() to get an XmlDictionaryReader and read the message into an XmlDocument (being careful to preserve whitespace on the XmlDocument so digital signatures aren't broken):

    MessageBuffer buffer = request.CreateBufferedCopy(int.MaxValue);
    
    messageContentType = WebOperationContext.Current.IncomingRequest.ContentType;
    
    try
    {
        using (MemoryStream mstream = new MemoryStream())
        {
            buffer.WriteMessage(mstream);
            mstream.Position = 0;
    
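            // Media type names are case-insensitive, so avoid an exact-case match.
            if (messageContentType.IndexOf("multipart/related", StringComparison.OrdinalIgnoreCase) >= 0)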
            {
                Encoding[] encodings = new Encoding[1];
                encodings[0] = Encoding.UTF8;
    
                // MTOM
                using (XmlDictionaryReader reader = XmlDictionaryReader.CreateMtomReader(mstream, encodings, messageContentType, XmlDictionaryReaderQuotas.Max))
                {
                    XmlDocument msgDoc = new XmlDocument();
                    msgDoc.PreserveWhitespace = true;
                    msgDoc.Load(reader);
    
                    requestAsString = msgDoc.OuterXml;
    
                    reader.Close();
                }
            }
            else
            {
                // Text
                using (StreamReader sr = new StreamReader(mstream))
                {
                    requestAsString = sr.ReadToEnd();
                }
            }
    
            request = buffer.CreateMessage();
        }
    }
    finally
    {
        buffer.Close();
    }
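Once the document is loaded, the signature values can be read like any other element text. The sketch below assumes the MTOM reader has already inlined the binary parts as base64 text, and it uses a hand-written XML string as a stand-in for msgDoc. SignatureValues and ReadBase64 are hypothetical names, and the sample XML is illustrative rather than a complete signature.

```csharp
using System;
using System.Xml;

public static class SignatureValues
{
    private const string DsigNs = "http://www.w3.org/2000/09/xmldsig#";

    // Returns the text of the first element matching xpath, with the "ds"
    // prefix bound to the XML-DSig namespace, or null if nothing matches.
    public static string ReadBase64(XmlDocument doc, string xpath)
    {
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("ds", DsigNs);
        XmlNode node = doc.SelectSingleNode(xpath, ns);
        return node == null ? null : node.InnerText.Trim();
    }

    public static void Main()
    {
        // Stand-in for msgDoc after it has been loaded through the MTOM reader.
        XmlDocument doc = new XmlDocument();
        doc.PreserveWhitespace = true;
        doc.LoadXml(
            "<ds:Signature xmlns:ds=\"http://www.w3.org/2000/09/xmldsig#\">" +
            "<ds:SignedInfo><ds:Reference>" +
            "<ds:DigestValue>mowXMw68eLSv9J1W7f43MvNgCrc=</ds:DigestValue>" +
            "</ds:Reference></ds:SignedInfo>" +
            "</ds:Signature>");

        string digestBase64 = ReadBase64(doc, "//ds:DigestValue");
        byte[] digest = Convert.FromBase64String(digestBase64);
        Console.WriteLine(digestBase64 + " (" + digest.Length + " bytes)");
    }
}
```

The same helper can be pointed at "//ds:SignatureValue". Only the copy used for comparison is trimmed; the document itself is left untouched so the signed content stays byte-for-byte intact.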