I'm trying to convert some HTML tags to custom tags using PHP. I've been trying to use DOMDocument, but I'm finding it incredibly cumbersome. Is there a simple way to do this with PHP / DOMDocument?
Input:
<div class="element_wrapper">
    <div class="element_header">My header</div>
    <div class="element">
        <div class="name">Element Name</div>
    </div>
</div>
Desired Output:
<element_wrapper>
    <element_header>My header</element_header>
    <element>
        <name>Element Name</name>
    </element>
</element_wrapper>
My first approach (incomplete, added per AndrewL64's request):
<?php
$templates = Repository::fetchTemplates();
$classes = [
    'element_wrapper',
    'element',
    'name',
    'element_header',
];
foreach ($templates as $template) {
    $html = '<div>' . $template['html_body'] . '</div>';
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $finder = new DOMXPath($dom);
    foreach ($classes as $class) {
        $div_nodes = $finder->query("//div[@class='$class']");
        /** @var DOMNode $div_node */
        foreach ($div_nodes as $div_node) {
            /** @var DOMElement $custom_tag */
            $custom_tag = $dom->createElement($class, $div_node->nodeValue);
            if ($div_node->hasAttributes()) {
                foreach ($div_node->attributes as $attribute) {
                    if ($attribute->nodeValue === $class) {
                        continue;
                    }
                    $custom_tag->setAttributeNode($attribute);
                }
            }
            $div_node->parentNode->replaceChild($custom_tag, $div_node);
        }
    }
}
Many thanks in advance!
In the end I used preg_replace and multiple DOMDocument instances to make the changes to the HTML. Doing this purely with DOMDocument involves a mess of recursion and rebuilding that is hard to keep track of and feels awfully error-prone. My solution follows:
<?php
$templates = TemplateRepository::fetchAll();
$classes = [
    'element_wrapper',
    'element',
    'name',
    'element_header',
];
// Custom tags are not valid HTML, so collect libxml's parse warnings instead of emitting them.
libxml_use_internal_errors(true);
foreach ($templates as $template) {
    // We need to guarantee a root element for DOMDocument to be happy. (strip later)
    $html = '<div>' . $template['html_body'] . '</div>';
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $finder = new DOMXPath($dom);
    $class_found = false; // track if we found a class / will have changes.
    foreach ($classes as $class) {
        // Match $class only as the first whole class name. A plain contains(@class, 'element')
        // would also match 'element_header' and 'element_wrapper'.
        $div_nodes = $finder->query("//div[starts-with(concat(@class, ' '), '$class ')]");
        /** @var DOMElement $div_node */
        foreach ($div_nodes as $div_node) {
            $class_found = true;
            $content = $dom->saveHTML($div_node);
            // I know that the class I want to turn into a custom tag will come right after the div opener, so replace that with the class.
            $content = preg_replace('@^<div class="' . $class . '([^>]+)>@', '<' . $class . ' class="\1>', $content);
            // Clean up empty class attribute...just cuz.
            $content = preg_replace("@<$class class=\"\s*\"@", "<$class", $content);
            // Replace closing div with closing custom tag. We can assume the end </div> is our target because DOMDocument did the heavy lifting.
            $content = preg_replace('@</div>$@', "</$class>", $content);
            // Create a new dom document from our new html string. We need this to create a DOMNode that we can import into our original.
            $dom_element = new DOMDocument();
            $dom_element->loadHTML($content);
            // We only want the original html, so just grab the first child of the body.
            $node = $dom_element->getElementsByTagName('body')[0]->firstChild;
            // Import the new node into our original document so we can use it to replace our <div> version.
            $node = $dom->importNode($node, true);
            // Replace our original.
            $div_node->parentNode->replaceChild($node, $div_node);
        }
    }
    // Get the final updated html.
    $new_body = $dom->saveHTML($dom->getElementsByTagName('body')[0]->firstChild);
    // And finish by stripping off our wrapper div we added at the start.
    // The "s" modifier lets "." match newlines, so multi-line templates are unwrapped too.
    $new_body = preg_replace('@^<div>(.*)</div>\s*$@s', '\1', $new_body);
    // Discard the warnings collected for this template.
    libxml_clear_errors();
}
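For comparison, the same renaming can probably be done with DOM calls alone and without recursion: create the custom element, move the div's children into it, copy the remaining attributes, then swap the two nodes. The following is only a rough sketch, not the approach used above. The function name divsToCustomTags and the wrapper id __root are made up for illustration, and it assumes the class names are valid element names and that the template does not already use that id.

```php
<?php
// Hypothetical helper (not part of the original solution): turns every
// <div class="X"> into <X> for each X in $classes, using DOM calls only.
function divsToCustomTags(string $html, array $classes): string
{
    // Collect libxml warnings instead of emitting them while parsing.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Wrap the fragment in a known root so it can be recovered afterwards.
    $dom->loadHTML('<div id="__root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($classes as $class) {
        // Match $class as a whole class name, not as a substring of another one.
        $query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
        foreach (iterator_to_array($xpath->query($query)) as $div) {
            $tag = $dom->createElement($class);
            // Move (not copy) the children, so nested divs stay in the document
            // and can still be replaced later in this loop.
            while ($div->firstChild !== null) {
                $tag->appendChild($div->firstChild);
            }
            // Copy the attributes, dropping the class name that became the tag name.
            foreach ($div->attributes as $attr) {
                if ($attr->nodeName === 'class') {
                    $rest = array_diff(preg_split('/\s+/', trim($attr->nodeValue)), [$class]);
                    if ($rest) {
                        $tag->setAttribute('class', implode(' ', $rest));
                    }
                    continue;
                }
                $tag->setAttribute($attr->nodeName, $attr->nodeValue);
            }
            $div->parentNode->replaceChild($tag, $div);
        }
    }

    // Serialize only the children of the wrapper, which drops the wrapper itself.
    $root = $xpath->query("//div[@id='__root']")->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Example call with the markup from the question (whitespace removed).
$input = '<div class="element_wrapper"><div class="element_header">My header</div>'
       . '<div class="element"><div class="name">Element Name</div></div></div>';
echo divsToCustomTags($input, ['element_wrapper', 'element', 'name', 'element_header']), PHP_EOL;
```

Because the children are moved rather than serialized and re-parsed, no second DOMDocument or regular expression is needed, and the order of the entries in $classes should not matter. Whether this is less cumbersome than the preg_replace version is a judgment call.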