I am trying to programmatically convert a table to remove some tags that I don't want. I wrote a recursive function that gets called on the table element and calls itself on all children, building a list of the sanitized children and replacing the current children with it. However, the nodes don't get added correctly; instead, a string representation such as [object HTMLTableSectionElement] or [object Text] ends up in the output.
(For those interested in context: I use thead, tbody, and a couple of custom tags for a fancy table, but I also want to be able to press a button to copy the table into Wikipedia pages, which don't support those tags, so while copying to the clipboard I have to remove them or it looks terrible.)
This is my function:
function prepHTMLforExport(element){
//recursively goes into an HTML element, preparing it for export by:
//removing thead and tbody statements but leaving their contents in the parent element
//removing floattext tags
//replacing hyperlink a tags with ...
//go through child elements and prep each for export
var newchildren = []
var children = element.childNodes
for (var i=0;i<children.length;i++){
var newkid = prepHTMLforExport(children[i])
//if the new kid is actually a list, concat the lists
if (newkid instanceof Array){
newchildren.concat(newkid)
} else {
newchildren.push(newkid)
}
}
//if to be removed, return a list of children
if (element.tagName in ["floattext", "table-body", "table-head"]){
return newchildren
} else {
//else add children to self and return self
if (newchildren.length > 0) {
element.replaceChildren(newchildren)
}
return element
}
}
It should work on basically any HTML table.
I originally suspected that I was checking the wrong children or using the wrong method to replace them, so I also tried it with var children = element.children, and I looked for a different replacement function, like replaceWith, but on closer inspection this didn't help. Elements already get converted to the wrong form in the node list of their parents while they are being processed, not only after I replace them with their sanitized version. To test this, I used this table:
<table>
<th>sample text</th>
<th>some more sample text</th>
</table>
When stepping through the code, if execution has just finished preparing the first cell and is about to recurse into the second cell, running console.log(element) results in
<table>
<th>[object Text]</th>
<th>some more sample text</th>
</table>
Your issue is that element.replaceChildren expects each child to be a separate parameter, but you are passing an array.
So replaceChildren converts the array to a string, giving
[object Text],[object HTMLTableSectionElement]
The extra comma in the middle is an additional hint that this was an array converted to a string.
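To see where that text comes from, here is a minimal sketch that runs without a DOM. FakeText and FakeSection are hypothetical stand-ins (not real DOM classes) that only mimic the names real nodes report when stringified, and replaceChildren itself is not called here.

```javascript
// Hypothetical stand-ins for DOM nodes (assumption: no real DOM is available here).
// They only mimic the type names that real nodes report when converted to a string.
class FakeText {
  get [Symbol.toStringTag]() { return "Text"; }
}
class FakeSection {
  get [Symbol.toStringTag]() { return "HTMLTableSectionElement"; }
}

const nodes = [new FakeText(), new FakeSection()];

// Converting an array to a string joins the string form of each item with commas.
const asString = String(nodes);
console.log(asString);
```

The comma in the result comes from the array's own string conversion, not from the nodes.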
You can expand an array into separate arguments using spread syntax - that line of your code becomes:
element.replaceChildren(...newchildren)
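As a rough illustration of the difference, the sketch below uses a hypothetical function, countArgs, that accepts any number of arguments the way replaceChildren does. The three dots in the function definition are rest parameters (they collect arguments), while the same dots at the call site are spread syntax (they expand an array).

```javascript
// Hypothetical variadic function: like replaceChildren, it accepts any number of arguments.
function countArgs(...items) {
  // `...items` here is a rest parameter: it collects all arguments into one array.
  return items.length;
}

const kids = ["a", "b", "c"];

const withoutSpread = countArgs(kids);  // the whole array arrives as a single argument
const withSpread = countArgs(...kids);  // spread: each item arrives as its own argument

console.log(withoutSpread, withSpread);
```

Without the spread, the function sees one argument (the array); with it, the function sees three.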
Updated snippet:
function prepHTMLforExport(element) {
//recursively goes into an HTML element, preparing it for export by:
//removing thead and tbody statements but leaving their contents in the parent element
//removing floattext tags
//replacing hyperlink a tags with ...
//go through child elements and prep each for export
var newchildren = []
var children = element.childNodes
for (let i = 0; i < children.length; i++) {
var newkid = prepHTMLforExport(children[i])
//if the new kid is actually a list, concat the lists
if (newkid instanceof Array) {
newchildren = newchildren.concat(newkid);
} else {
newchildren.push(newkid)
}
}
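//if to be removed, return a list of children
//(`in` would test array indices, not values; tagName is uppercase in HTML and undefined for text nodes)
if (["floattext", "table-body", "table-head"].includes((element.tagName || "").toLowerCase())) {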
return newchildren
} else {
//else add children to self and return self
if (newchildren.length > 0) {
element.replaceChildren(...newchildren);
}
return element
}
}
prepHTMLforExport(document.getElementById("tbl"));
<table id='tbl'>
<th>sample text</th>
<th>some more sample text</th>
</table>
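Two other details in the function from the question are easy to miss: concat does not modify the array it is called on, and the in operator applied to an array checks indices rather than values. The following is a small sketch of both behaviours using plain arrays; the "FLOATTEXT" string is only a stand-in for what element.tagName would report for an element in an HTML document.

```javascript
// concat returns a new array and leaves the original untouched.
const list = [1];
list.concat([2, 3]);                 // result discarded, list is still [1]
const merged = list.concat([2, 3]);  // [1, 2, 3]

// `in` on an array tests indices (property keys), not the stored values.
const tags = ["floattext", "table-body", "table-head"];
const viaIn = "floattext" in tags;               // false: no property named "floattext"
const indexHit = 0 in tags;                      // true: index 0 exists
const viaIncludes = tags.includes("floattext");  // true: compares values

// tagName is uppercase for elements in HTML documents, so normalise the case first.
// ("FLOATTEXT" stands in for element.tagName here.)
const matches = tags.includes("FLOATTEXT".toLowerCase());

console.log(list, merged, viaIn, indexHit, viaIncludes, matches);
```

So a tag test along the lines of tags.includes(element.tagName.toLowerCase()) is probably closer to what was intended, with a guard for text nodes, whose tagName is undefined.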