
Including UTF-8 characters in XML CDATA for Exchange Web Services


I am creating calendar entries in Microsoft 365 using EWS. I make direct HTTP calls and build the XML myself, with no library involved. Within the body of the calendar item (i.e. the notes) I need to include UTF-8 sequences for special characters. I thought I had this working, but now including such sequences produces HTTP error 500 with a message that the data failed schema validation.

Below is an example of the XML that I am creating. Where I've indicated XYZ, assume that those are the 3 bytes 0xE2, 0x82, 0xAC, which is the UTF-8 encoding of the Euro character. This fails validation. If I remove those 3 bytes, it works fine. Note that the XML declaration specifies UTF-8 encoding. I've also set the HTTP header "Content-Type: text/xml; charset=utf-8". Any idea how I can tell EWS to handle UTF-8 characters?

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
<soap:Header>
<t:RequestServerVersion Version="Exchange2010_SP1"/>
</soap:Header>
<soap:Body>
<CreateItem SendMeetingInvitations="SendToNone" xmlns="http://schemas.microsoft.com/exchange/services/2006/messages"
xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
<SavedItemFolderId>
<t:FolderId Id="AAMkADNjNjA1MTIxLWNlNmItNDBjMS04NWE0LTQ3ZmM0YTFiMTg4MAAuAAAAAADzAjMnWlbcRo51UWjY2+udAQBtGm0AIaIyTp7trMMKSGyiAAAAzNL0AAA="/>
</SavedItemFolderId>
<Items>
<t:CalendarItem>
<t:Subject>Test Euro character</t:Subject>
<t:Body BodyType="Text">
<![CDATA[
This is a Euro character: XYZ
]]>
</t:Body>
<t:Importance>Normal</t:Importance>
<t:ReminderIsSet>false</t:ReminderIsSet>
<t:Start>2010-08-24T14:30:00Z</t:Start>
<t:End>2010-08-24T15:00:00Z</t:End>
<t:LegacyFreeBusyStatus>Busy</t:LegacyFreeBusyStatus>
<t:Location></t:Location>
</t:CalendarItem>
</Items>
</CreateItem>
</soap:Body>
</soap:Envelope>

Solution

  • Just enter the actual character. A CDATA section holds character data, not separately encoded byte data; the character is written out as UTF-8 bytes when the file is saved. For example, I created the following in Notepad++ and saved the file as UTF-8:

    <t:Body><![CDATA[This is a Euro character: €]]></t:Body>
    

    Hex dump of the file:

    [Image: hex dump of the example file]

    If you still have trouble, post the code used to build the XML. When you build XML without a library, it is easy to produce a document that violates the XML standard. Use a hex dump program to compare your file with this example and see where they differ.

    Nothing in this data needs CDATA (there are no `<` or `&` characters to protect), so you could just use:

    <t:Body>This is a Euro character: €</t:Body>
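
If you are generating the request from code rather than a text editor, the same idea can be sketched in Python. This is only an illustration under assumptions: the helper names `build_body` and `hex_dump` are made up here, and the question does not say which language builds the XML. It escapes the XML-special characters, encodes the whole element as UTF-8 once, and prints the bytes in hex so you can check that the Euro sign appears as `e2 82 ac`.

```python
from xml.sax.saxutils import escape


def build_body(text: str) -> bytes:
    """Return a <t:Body> element as UTF-8 bytes (hypothetical helper).

    The text is handled as characters; '&', '<' and '>' are escaped so
    no CDATA section is needed, and the encoding to bytes happens once.
    """
    element = f'<t:Body BodyType="Text">{escape(text)}</t:Body>'
    return element.encode("utf-8")


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated hex pairs (hypothetical helper)."""
    return data.hex(" ")


# "\u20ac" is the Euro sign, written as an escape so this script does not
# depend on how the source file itself is saved.
body = build_body("This is a Euro character: \u20ac")
print(hex_dump(body))
```

Send the resulting bytes unchanged as the HTTP request body. If the hex output shows something other than a single `e2 82 ac` sequence for the Euro sign, the text was most likely encoded twice or with a different encoding somewhere along the way.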