Tags: xml, dom, xml-parsing, simplexml, phpexcel

Getting SimpleXML to understand input tags that are not self-closing?


I need to take an HTML table containing self-closing input tags, which is posted from the client to the server, and parse it with SimpleXML so that I can convert it to an Excel file via PHPExcel.

The problem is that the browser strips the self-closing slash from the input tags, which in turn produces an error when I pass the markup to the simplexml_load_string function.

$table = '<table><tr><td><input name="test" value="1" type="checkbox" ></td></tr></table>';
$xml = simplexml_load_string($table);

If I could stop the browser changing the code from:

<input name="test" value="1" type="checkbox" />

to:

<input name="test" value="1" type="checkbox" >

That would solve my problem, but I can't find out how to do this.

Is there a way to make simplexml_load_string accept input tags that are not self-closing, or is there something else that I'm missing?

http://phpfiddle.org/main/code/bw3x-zvtw


Solution

  • There is a trick for this: the DOM extension can parse HTML, including unclosed tags like the ones you have here, and SimpleXML can "import" a DOM object (without actually reparsing anything, because both extensions use the same memory structure underneath).

    It ought to be as simple as:

    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $sx = simplexml_import_dom($dom);
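
For a fuller picture, here is a minimal sketch of the same approach applied to the $table string from the question. The libxml_use_internal_errors() calls and the XPath lookup are additions for illustration, not part of the answer above: the first keeps any parser complaints about real-world markup from surfacing as PHP warnings, and the second avoids depending on the html and body wrapper elements that loadHTML() adds around a fragment.

```php
<?php
// Markup as posted by the browser: the <input> tag is not self-closed.
$table = '<table><tr><td><input name="test" value="1" type="checkbox" ></td></tr></table>';

$dom = new DOMDocument;

// Collect any parser warnings internally instead of emitting PHP warnings
// (an illustrative precaution, not required for this particular input).
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($table);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Hand the parsed tree to SimpleXML; the root element is <html>.
$sx = simplexml_import_dom($dom);

// Look the inputs up with XPath so the surrounding wrapper elements do not matter.
$inputs = $sx->xpath('//input');

foreach ($inputs as $input) {
    // Attributes are read with array syntax on a SimpleXMLElement.
    echo $input['name'], '=', $input['value'], "\n"; // prints: test=1
}
```

From here, $sx can be walked like any other SimpleXML tree when building the spreadsheet rows.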