
PHP XML response: start tag expected, but I see it in var_dump


I have the following being returned

var_dump:

string(799) "<?xml version="1.0" encoding="ISO-8859-1"?> <serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service" xmlns:com="http://www.webex.com/schemas/2002/06/common" xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee"><serv:header><serv:response><serv:result>SUCCESS</serv:result><serv:gsbStatus>BACKUP</serv:gsbStatus></serv:response></serv:header><serv:body><serv:bodyContent xsi:type="att:registerMeetingAttendeeResponse" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><att:register><att:attendeeID>29281003</att:attendeeID></att:register></serv:bodyContent></serv:body></serv:message>"

I'm trying to use SimpleXML, but I'm first validating the output with this function (sorry, I can't remember where I found it on Stack Overflow):

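function isXML($xml){
    // Collect libxml errors instead of emitting PHP warnings
    libxml_use_internal_errors(true);

    $doc = new DOMDocument('1.0', 'utf-8');
    $doc->loadXML($xml);

    $errors = libxml_get_errors();

    if(empty($errors)){
        return true;
    }

    // Levels 1 (warning) and 2 (recoverable error) are tolerated; 3 is fatal
    $error = $errors[0];
    if($error->level < 3){
        return true;
    }

    // Note: this splits on the letter "r", not on a line break ("\r" or "\n"
    // was probably intended), so the snippet reported below is cut short
    $explodedxml = explode("r", $xml);
    $badxml = $explodedxml[($error->line)-1];

    $message = $error->message . ' at line ' . $error->line . '. Bad XML: ' . htmlentities($badxml);
    return $message;
}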

Result of isXML():

Start tag expected, '<' not found at line 1. Bad XML: &lt;?xml ve

I see the '<', unless the var_dump is inaccurate. I've broken this thing down as much as I could. Any help would be greatly appreciated.


Solution

  • I stripped the problem down a little more:

    $xml = '<?xml version="1.0" encoding="ISO-8859-1"?>
    <serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"/>';
    
    // escape xml special chars - this will provoke the error
    $xml = htmlspecialchars($xml);
    
    $document = new DOMDocument();
    $document->loadXml($xml);
    

    Output:

    Warning: DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, line: 1 in /tmp/... 
    

    What happens is that your XML is still escaped/encoded. You do not see that in the browser because the browser treats the response (including the var_dump() output) as HTML and renders each escaped sequence as the character it stands for. Open the source view to check the actual value.

    Debug the code that reads the XML string. You may want to fix the escaping where it happens, or add an html_entity_decode() call there.

    HINT: Your XML uses namespaces, so you might be better off with DOM + XPath. Check out DOMXPath::evaluate().
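
Both suggestions can be combined roughly as follows. This is only a sketch: `$escaped` is a hypothetical stand-in built from a shortened copy of the response in the question, and the check for a leading `&lt;` is an assumption about how the escaping shows up in your data, not something the WebEx API documents. The prefixes passed to `registerNamespace()` only need to map to the same namespace URIs as the document.

```php
<?php
// Hypothetical stand-in for the escaped response body (shortened from the question).
$escaped = htmlspecialchars(
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    . '<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"'
    . ' xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">'
    . '<serv:header><serv:response><serv:result>SUCCESS</serv:result></serv:response></serv:header>'
    . '<serv:body><serv:bodyContent><att:register>'
    . '<att:attendeeID>29281003</att:attendeeID>'
    . '</att:register></serv:bodyContent></serv:body>'
    . '</serv:message>'
);

// An escaped document starts with "&lt;" instead of "<".
$looksEscaped = strncmp(ltrim($escaped), '&lt;', 4) === 0;

// Undo the escaping only when it is actually present.
$xml = $looksEscaped
    ? html_entity_decode($escaped, ENT_QUOTES | ENT_XML1)
    : $escaped;

$document = new DOMDocument();
$document->loadXML($xml);

// Register prefixes for the namespaces used in the XPath expressions.
$xpath = new DOMXPath($document);
$xpath->registerNamespace('serv', 'http://www.webex.com/schemas/2002/06/service');
$xpath->registerNamespace('att', 'http://www.webex.com/schemas/2002/06/service/attendee');

// string() makes evaluate() return a plain PHP string instead of a node list.
$result     = $xpath->evaluate('string(/serv:message/serv:header/serv:response/serv:result)');
$attendeeId = $xpath->evaluate('string(//att:register/att:attendeeID)');

echo $result, "\n";
echo $attendeeId, "\n";
```

Decoding at the point where the string is read is a workaround; if you control the code that escapes the response in the first place, removing the escaping there is the cleaner fix.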