php xml xslt converters

Warning: DOMDocument::loadXML(): Start tag expected, '<' not found in Entity


We import products from an .xml file.

To import the products correctly, we first had to create an .xsl file that converts the .xml file served from the link URL into the format we require.

The link to the .xml file looks like this: https://www.importfilexml.de/restful/export/api/products.xml?acceptedlocales=en_US&output-filetype=xml

When I paste a link with a tag, for example to select a single brand: https://www.importfilexml.de/restful/export/api/products.xml?acceptedlocales=en_US&output-filetype=xml&tag_1=Love+Moschino

it works correctly. But when I paste the link to the full product catalog: https://www.importfilexml.de/restful/export/api/products.xml?acceptedlocales=en_US&output-filetype=xml

then, when the import validates the .xsl conversion of the .xml, I get this warning:
Warning: DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, line: 1 in /home/usr/domains/mywebsite.pl/public_html/vendor/firebear/importexport/Model/Output/Xslt.php on line 34

Code from Xslt.php:

    /**
     * @param $file
     * @param $xsl
     * @return string
     * @throws \Magento\Framework\Exception\LocalizedException
     */
    public function convert($file, $xsl)
    {
        if (!class_exists('\XSLTProcessor')) {
            throw new LocalizedException(__(
                'The XSLTProcessor class could not be found. This means your PHP installation is missing XSL features.'
            ));
        }
        $xmlDoc = new \DOMDocument();

        $xmlDoc->loadXML($file, LIBXML_COMPACT | LIBXML_PARSEHUGE | LIBXML_NOWARNING);

        $xslDoc = new \DOMDocument();
        $xslDoc->loadXML($xsl, LIBXML_COMPACT | LIBXML_PARSEHUGE | LIBXML_NOWARNING);

        $proc = new \XSLTProcessor();
        $proc->registerPHPFunctions();
        $proc->importStylesheet($xslDoc);
        try {
            $newDom = $proc->transformToDoc($xmlDoc);
        } catch (\Exception $e) {
            throw new LocalizedException(__("Error : " . $e->getMessage()));
        }

        return $newDom->saveXML();
    }
}

Excerpt from the .xml file:

<?xml version="1.0" encoding="UTF-8"?>
<Items>
    <product>
        <sku>CPW88FXXCD_002_L34_32</sku>
        <group>106003</group>
        <product_from_website>brand</product_from_website>
        <url_key>panasonic-Trousers-Men-MW0MW02349-grey-32</url_key>
        <name>panasonic Trousers Men MW0MW02349 grey</name>
        <custom_name>panasonic Trousers Men</custom_name>
        <description>&lt;div class='pdbDescContainer'&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Collection:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;Spring/Summer&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Gender:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;Man&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Type:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;Trousers&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Fastening:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;&lt;span class='pdbDescSectionList'&gt;&lt;span class='pdbDescSectionItem'&gt;buttons&lt;/span&gt;&lt;span class='pdbDescSectionItem'&gt;zip&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Pockets:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;4&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Material:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;&lt;span class='pdbDescSectionList'&gt;&lt;span class='pdbDescSectionItem'&gt;cotton 96%&lt;/span&gt;&lt;span class='pdbDescSectionItem'&gt;elastane 4%&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Pattern:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;checkered&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Washing:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;&lt;span class='pdbDescSectionList'&gt;&lt;span class='pdbDescSectionItem'&gt;wash at 30° C&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Model height, cm:&lt;/span&gt;&lt;span 
class='pdbDescSectionText'&gt;185&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Model wears a size:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;32&lt;/span&gt;&lt;/div&gt;&lt;div class='pdbDescSection'&gt;&lt;span class='pdbDescSectionTitle'&gt;Details:&lt;/span&gt;&lt;span class='pdbDescSectionText'&gt;&lt;span class='pdbDescSectionList'&gt;&lt;span class='pdbDescSectionItem'&gt;visible logo&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;</description>
        <qty>3</qty>
        <price>88.50</price>
        <special_price>44.50</special_price>
        <weight />
        <color>grey</color>
        <gender />
        <ean>8719255365841</ean>
        <brand>panasonic</brand>
        <length />
        <size>32</size>
        <categories>Clothing/Trousers/Men</categories>
        <product_online>1</product_online>
        <group>106003</group>
        <product_websites>base</product_websites>
        <attribute_set_code>Default</attribute_set_code>
        <product_type>simple</product_type>
        <image>https://www.importwebsite.com/prod/stock_product_image_106003_2086033795.jpg</image>
        <additional_images>https://www.importwebsite.com/prod/stock_product_image_106003_2086033795.jpg,https://www.importwebsite.com/prod/stock_product_image_106003_343223477.jpg,https://www.importwebsite.com/prod/stock_product_image_106003_287457799.jpg,https://www.importwebsite.com/prod/stock_product_image_106003_570760537.jpg</additional_images>
    </product>

Solution

  • I think the error is not in the XSLT but in how you use PHP's DOMDocument API. It has two methods: load, which you should use when you have a file name, file path, or URI of the XML or XSLT you want to load, and loadXML, which you should use when you have a string containing the XML or XSLT code you want to parse.

    The error you get suggests that you call loadXML but pass in the file name, path, or URI of the XML or XSLT instead of the XML or XSLT code itself. For a file name, path, or URI you should use the load method.

    See http://sandbox.onlinephpfunctions.com/code/f080d3aedcc93d591018902724b7846eb063d36b which demonstrates that $doc->loadXML('foo.xml') generates the error DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, while $doc->loadXML('<root>test</root>'); works fine. So change your loadXML calls to load calls in the PHP code.
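
The difference between the two methods can be reproduced locally with a small script. This is only a minimal sketch: the temporary file and the tiny `<Items>` document are made up for illustration and stand in for the real export, and it assumes the argument reaching `convert()` really is a path or URL rather than the XML text itself.

```php
<?php
// Sketch: DOMDocument::loadXML() parses a string of markup,
// while DOMDocument::load() reads the markup from a file path or URI.

// Collect parser errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical temporary file standing in for the real product export.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, '<Items><product><sku>ABC</sku></product></Items>');

// 1) A path handed to loadXML(): the path text itself is parsed as markup,
//    so parsing fails with "Start tag expected, '<' not found".
$doc = new DOMDocument();
$okPathAsString = $doc->loadXML($path);
$errors = libxml_get_errors();
$firstError = $errors ? trim($errors[0]->message) : '';
libxml_clear_errors();

// 2) The same path handed to load(): the file is opened and parsed.
$doc = new DOMDocument();
$okFromPath = $doc->load($path);
$sku = $okFromPath
    ? $doc->getElementsByTagName('sku')->item(0)->textContent
    : null;

// 3) The file contents handed to loadXML(): also parsed successfully.
$doc = new DOMDocument();
$okFromString = $doc->loadXML(file_get_contents($path));

unlink($path);

var_dump($okPathAsString, $okFromPath, $okFromString);
echo $firstError, PHP_EOL;
```

If case 1 matches what you see in the import, then either switching to load or reading the content first and keeping loadXML should remove the warning.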