c# string encoding base64 decoding

Encoding string from reading email


I am using the Gmail API to read emails from a Gmail account.

In the body, I replace some characters, which, from what I read on the forums, is required:

 String codedBody = body.Replace("-", "+");
 codedBody = codedBody.Replace("_", "/");

The problem is that when I try to convert it

byte[] data = Convert.FromBase64String(codedBody);

an exception is thrown for some emails:

System.FormatException: 'The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.'

The string which is coming from the request is:

"0J7QsdGP0LLQsDogSGVhbHRoY2FyZSBTZXJ2aWNlIFJlcHJlc2VudGF0aXZlIHdpdGggRHV0Y2gsIEdlcm1hbiANCiDQktCw0LbQvdC-ISDQnNC-0LvRjywg0L3QtSDQvtGC0LPQvtCy0LDRgNGP0LnRgtC1INC90LAg0YLQvtC30LggZW1haWwuICANCiAg0KLQvtC30LggZW1haWwg0LUg0LjQt9C_0YDQsNGC0LXQvSDQv9GA0LXQtyBqb2JzLmJnINC-0YIg0LjQvNC10YLQviDQvdCwINCa0YDQuNGB0YLQuNCw0L0g0JrRitC90LXQsiAg0JfQsCDQtNCwINGB0LUg0YHQstGK0YDQttC10YLQtSDRgSDQutCw0L3QtNC40LTQsNGC0LAg0YfRgNC10LcgZW1haWwg0LjQt9C_0L7Qu9C30LLQsNC50YLQtToga3Jpc3RpYW5fdG9uaUBhYnYuYmcgIA0KICDQodGK0L7QsdGJ0LXQvdC40LUg0L7RgiDQutCw0L3QtNC40LTQsNGC0LA6ICANCiAg0LHQu9Cw0LHQu9Cw0LHQu9Cw0LHQu9CwDQoNCg0KDQoNCg0KICA=PEhUTUw-PEJPRFk-DQrQntCx0Y_QstCwOiBIZWFsdGhjYXJlIFNlcnZpY2UgUmVwcmVzZW50YXRpdmUgd2l0aCBEdXRjaCwgR2VybWFuPGRpdj48YnI-PGRpdj7QktCw0LbQvdC-ISDQnNC-0LvRjywg0L3QtSDQvtGC0LPQvtCy0LDRgNGP0LnRgtC1INC90LAg0YLQvtC30LggZW1haWwuPC9kaXY-PGRpdj48YnI-PC9kaXY-PGRpdj7QotC-0LfQuCBlbWFpbCDQtSDQuNC30L_RgNCw0YLQtdC9INC_0YDQtdC3IGpvYnMuYmcg0L7RgiDQuNC80LXRgtC-INC90LAg0JrRgNC40YHRgtC40LDQvSDQmtGK0L3QtdCyPC9kaXY-PGRpdj7Ql9CwINC00LAg0YHQtSDRgdCy0YrRgNC20LXRgtC1INGBINC60LDQvdC00LjQtNCw0YLQsCDRh9GA0LXQtyBlbWFpbCDQuNC30L_QvtC70LfQstCw0LnRgtC1OiBrcmlzdGlhbl90b25pQGFidi5iZzwvZGl2PjxkaXY-PGJyPjwvZGl2PjxkaXY-0KHRitC-0LHRidC10L3QuNC1INC-0YIg0LrQsNC90LTQuNC00LDRgtCwOjwvZGl2PjxkaXY-PGJyPjwvZGl2PjxkaXY-0LHQu9Cw0LHQu9Cw0LHQu9Cw0LHQu9CwPGJyPjxicj48YnI-PGJyPjxicj48YnI-PC9kaXY-PC9kaXY-PC9CT0RZPjwvSFRNTD4NCg=="

What is causing this problem?


Solution

  • Your source Base64 string is not valid. It contains a padding character (=) at position 604, in the middle of the string.

    It looks as if two valid Base64 strings have been concatenated. Go back to your source and make sure you are collecting them correctly.

    The source has to provide some detail on this, as Base64 itself provides no means to determine whether two values have been joined like this. If the length of the first source byte array had been a multiple of 3, there would have been no padding character in the middle, and the string would have decoded without error, silently giving you both payloads run together.

    For what it's worth, replacing those characters appears to be correct, as there is no de facto standard for which two symbol characters are used in Base64. However, make sure you have them the right way around.

    Update

    Having investigated further (learning is fun), I found that there is a defined Base64 standard (RFC 4648), which defines two separate Base64 alphabets.

    The Base 64 Alphabet defines + and / for the two symbols, and = for the padding character.

    The same RFC also specifies the "URL and Filename safe" Base 64 Alphabet, which uses - and _ for the two symbols, and = (or %3D) for the padding character.

    It appears your source data uses the "URL and Filename safe" format, while Convert.FromBase64String() only accepts the normal format. Therefore you are quite correct to replace - with + and _ with / to convert from one to the other.
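
To make the points above concrete, here is a minimal sketch. It assumes the two halves of the string come from separately encoded pieces of the message (the second half of the posted string does appear to decode to text starting with <HTML>), and that each piece should be decoded on its own. DecodeBase64Url and Base64UrlDemo are hypothetical names for illustration, not part of .NET or of the Gmail client library, and the two short sample strings stand in for the real bodies.

```csharp
using System;
using System.Text;

public static class Base64UrlDemo
{
    // Hypothetical helper: converts the "URL and Filename safe" alphabet
    // to the standard one, then decodes with Convert.FromBase64String.
    public static byte[] DecodeBase64Url(string input)
    {
        string s = input.Replace('-', '+').Replace('_', '/');

        // URL-safe Base64 is sometimes sent without trailing '=' padding.
        // Convert.FromBase64String needs a length that is a multiple of 4,
        // so restore the padding if it is missing. A remainder of 1 is
        // never valid and is left for FromBase64String to reject.
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }

        return Convert.FromBase64String(s);
    }

    public static void Main()
    {
        // Two separately encoded values (sample data, not from the question).
        string plainPart = "SGk=";         // "Hi"
        string htmlPart = "PGI-SGk8L2I-";  // "<b>Hi</b>"

        // Joining them puts '=' in the middle, which reproduces the error.
        try
        {
            DecodeBase64Url(plainPart + htmlPart);
            Console.WriteLine("Joined string decoded");
        }
        catch (FormatException)
        {
            Console.WriteLine("Joined string rejected");
        }

        // Decoding each value on its own works.
        Console.WriteLine(Encoding.UTF8.GetString(DecodeBase64Url(plainPart)));
        Console.WriteLine(Encoding.UTF8.GetString(DecodeBase64Url(htmlPart)));
    }
}
```

In real code, the equivalent of plainPart and htmlPart would be whatever individual encoded values the API response contains; the point is only to decode each one before combining the results, rather than joining the encoded strings first.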