PHP XMLReader get all node names


How can I get all the unique node names with XMLReader? For example, given the XML data below:

<a>
    <b>
        <firstname>John</firstname>
        <lastname>Doe</lastname>
    </b>
    <c>
        <firstname>John</firstname>
        <lastname>Smith</lastname>
        <street>Streetvalue</street>
        <city>NYC</city>
    </c>
    <d>
        <street>Streetvalue</street>
        <city>NYC</city>
        <region>NY</region>
    </d>
</a>

How can I get firstname, lastname, street, city, region from the above XML data using XMLReader? The file is also very big, so performance matters when collecting the node names.

Thanks


Solution

  • I didn't get the chance to test it, but give this a try:

    $reader = new XMLReader();
    $reader->open($input_file);
    $nodeList = array();
    
    while ($reader->read())
    {
        // Only element nodes are of interest. In the sample, the field
        // elements sit at depth 2: <a> is depth 0 and <b>, <c>, <d> are depth 1.
        if ($reader->nodeType == XMLReader::ELEMENT && $reader->depth == 2)
        {
            // Save the element name to an auxiliary array
            array_push($nodeList, $reader->localName);
        }
    }
    $reader->close();
    
    // Finally, filter out the duplicates
    $nodeList = array_unique($nodeList);
    

    Performance-wise, if the file is huge then XMLReader is the way to go, since it only holds the current node in memory (DOMDocument, on the other hand, loads the whole document). Here's a more detailed explanation regarding the three techniques you can use to read the XML.

    By the way, if the array containing the node names grows too large, call array_unique periodically (instead of only once at the end) to trim it.
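
    One way to avoid the growing array altogether is to store each name as an array key, so duplicates collapse as they are read. The following is a minimal sketch, not a tested solution for the real file: the helper name uniqueElementNames is hypothetical, the depth of 2 is an assumption taken from the sample XML, and the inline string stands in for the actual input.

```php
<?php
// Hypothetical helper (not part of XMLReader): collects the unique element
// names found at one depth, using array keys so duplicates never pile up.
function uniqueElementNames(XMLReader $reader, int $depth): array
{
    $seen = array();
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->depth === $depth) {
            // Assigning the same key twice is a no-op, so no array_unique is needed
            $seen[$reader->localName] = true;
        }
    }
    return array_keys($seen);
}

// Shortened version of the sample data, inlined to keep the sketch self-contained
$xml = '<a><b><firstname>John</firstname><lastname>Doe</lastname></b>'
     . '<c><firstname>John</firstname><street>Streetvalue</street></c></a>';

$reader = new XMLReader();
$reader->XML($xml);   // for a file on disk, use $reader->open($input_file) instead
$names = uniqueElementNames($reader, 2);   // depth 2 is an assumption from the sample
$reader->close();

print_r($names);
```

    Memory use then depends on the number of distinct names rather than on the number of elements in the file.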