
Is there a way to get the first line using XMLReader?


Is there a way to get the opening tag of the element using xmlreader in PHP?

I have this type of XML:

<Product id="L20" manufacturer="A">
    <Description>Desc</Description>
    <Price>5.00</Price>
</Product>

This is my code. $reader is an XMLReader type.

while($reader->read()) {
    if($reader->nodeType == XMLReader::ELEMENT) {
        //??
    }
}

I want it to get <Product id="L20" manufacturer="A"> as the output.

I want to specify that I want to use XMLReader. Stop suggesting that it's a duplicate of others when they use DOM or SimpleXML. I have a large XML file, and loading it all into memory is not possible on my current system.


Solution

  • XMLReader does not hand you the opening tag as a string, so you need to read the name and the attributes of the node using XMLReader methods. It seems that XMLReader::readString() is not implemented for attribute nodes, so this example collects the attribute names, navigates back to the element, and reads each value with XMLReader::getAttribute():

    $xml = <<<'XML'
    <Product id="L20" manufacturer="A">
        <Description>Desc</Description>
        <Price>5.00</Price>
    </Product>
    XML;
    
    $reader = new XMLReader();
    $reader->open('data://text/plain;base64,' . base64_encode($xml));
    
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'Product') {
            var_dump($reader->localName);
            $attributeNames = [];
            $found = $reader->moveToFirstAttribute();
            while ($found && $reader->nodeType === XMLReader::ATTRIBUTE) {
                $attributeNames[] = $reader->localName;
                $found = $reader->moveToNextAttribute();
            }
            $reader->moveToElement();
            var_dump(
                array_combine(
                    $attributeNames,
                    array_map(
                        function($name) use ($reader) {
                            return $reader->getAttribute($name);
                        },
                        $attributeNames,
                    )
                )
            );
        }
    }
    

    Output:

    string(7) "Product"
    array(2) {
      ["id"]=>
      string(3) "L20"
      ["manufacturer"]=>
      string(1) "A"
    }
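
The question asks for the start tag as a string, so here is a minimal sketch of one way to assemble it. It relies on the `value` property of XMLReader, which holds the attribute value while the reader is positioned on an attribute node, so names and values can be read in a single pass. `openingTag()` is a hypothetical helper, not part of XMLReader, and the sample is loaded with `XML()` instead of the data URI purely for brevity. The sketch always closes the tag with `>`; a self-closing element would need an extra `isEmptyElement` check made before moving to the attributes.

```php
<?php
// Hypothetical helper (not an XMLReader method): rebuilds the start tag of
// the element the reader is currently positioned on.
function openingTag(XMLReader $reader): string
{
    $tag = '<' . $reader->name;
    if ($reader->moveToFirstAttribute()) {
        do {
            // On an attribute node, name and value describe that attribute.
            $value = htmlspecialchars($reader->value, ENT_XML1 | ENT_COMPAT, 'UTF-8');
            $tag .= sprintf(' %s="%s"', $reader->name, $value);
        } while ($reader->moveToNextAttribute());
        // Go back to the owning element so the main loop continues normally.
        $reader->moveToElement();
    }
    return $tag . '>';
}

$xml = '<Product id="L20" manufacturer="A">'
    . '<Description>Desc</Description><Price>5.00</Price>'
    . '</Product>';

$reader = new XMLReader();
$reader->XML($xml);

$tags = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'Product') {
        $tags[] = openingTag($reader);
    }
}
$reader->close();

echo implode("\n", $tags), "\n";
// prints: <Product id="L20" manufacturer="A">
```

For a real file, `$reader->open($path)` would replace the `XML()` call; the loop stays the same.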
    

    It is also possible to combine XMLReader with DOM. Large XML files are usually lists of items, so you can look for each item node using XMLReader and expand it into DOM for the more complex work. If your XML is a list of Product nodes, you can iterate over them and expand each one. Only the current Product node and its descendants are loaded into memory at a time, and you can use DOM methods and XPath expressions on the expanded node.

    $xml = <<<'XML'
    <Products>
    <Product id="L20" manufacturer="A">
        <Description>Desc</Description>
        <Price>5.00</Price>
    </Product>
    <Product id="L30" manufacturer="B">
        <Description>Desc</Description>
        <Price>5.00</Price>
    </Product>
    </Products>
    XML;
    
    $reader = new XMLReader();
    $reader->open('data://text/plain;base64,' . base64_encode($xml));
    
    // a document to expand to
    $document = new DOMDocument();
    
    // skip ahead to the first Product element
    while ($reader->read() && $reader->localName !== 'Product') {
        continue;
    }
    
    while ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'Product') {
        $productNode = $reader->expand($document);
        var_dump($productNode->localName);
        var_dump(
            array_map(
                function($node) {
                    return $node->textContent;
                },
                iterator_to_array($productNode->attributes)
            )
        );
    
        // next Product sibling
        $reader->next('Product');
    }
    

    Output:

    string(7) "Product"
    array(2) {
      ["id"]=>
      string(3) "L20"
      ["manufacturer"]=>
      string(1) "A"
    }
    string(7) "Product"
    array(2) {
      ["id"]=>
      string(3) "L30"
      ["manufacturer"]=>
      string(1) "B"
    }
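
The XPath part mentioned above could look roughly like the following sketch. It assumes a `DOMXPath` instance created for the same `$document` that is passed to `expand()`, and it evaluates expressions relative to the expanded node. The array keys (`id`, `description`, `price`) and the `$products` list are made up for illustration, and the sample is loaded with `XML()` for brevity.

```php
<?php
$xml = <<<'XML'
<Products>
<Product id="L20" manufacturer="A">
    <Description>Desc</Description>
    <Price>5.00</Price>
</Product>
<Product id="L30" manufacturer="B">
    <Description>Desc</Description>
    <Price>5.00</Price>
</Product>
</Products>
XML;

$reader = new XMLReader();
$reader->XML($xml);

// The document to expand into, and an XPath object bound to it.
$document = new DOMDocument();
$xpath = new DOMXPath($document);

// Skip ahead to the first Product element.
while ($reader->read() && $reader->localName !== 'Product') {
    continue;
}

$products = [];
while ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'Product') {
    $productNode = $reader->expand($document);
    // Expressions are evaluated with the expanded node as context.
    $products[] = [
        'id'          => $xpath->evaluate('string(@id)', $productNode),
        'description' => $xpath->evaluate('string(Description)', $productNode),
        'price'       => $xpath->evaluate('number(Price)', $productNode),
    ];
    // Move to the next Product sibling, skipping the current subtree.
    $reader->next('Product');
}
$reader->close();

var_dump($products);
```

Casting inside the expression with `string()` and `number()` returns scalars directly, so there is no node list to unwrap for single values.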