I'm outputting HTML that's all crushed together, and I'd like to give it proper indentation. I've been trying to use xmllint for this, but with no joy. For example, when file.html contains:
<table><tr><td><b>Foo</b></td></tr></table>
<table><tr><td>Bar</td></tr></table>
I get:
$ xmllint --format file.html
file.html:2: parser error : Extra content at the end of the document
<table><tr><td>Bar</td></tr></table>
^
<<< exit status [1] >>>
But when file.html contains either of those lines alone, it works fine. For example, with the second line removed:
$ xmllint --format file.html
<?xml version="1.0"?>
<table>
<tr>
<td>
<b>Foo</b>
</td>
</tr>
</table>
When I include the --html option, it's more likely to run without errors, but then it doesn't indent.
Any suggestions? Are there any other (*nix) tools I can use for this? Thanks ...
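I think this is because the HTML you supplied doesn't have a single root element, which makes it invalid XML: a well-formed XML document must have exactly one top-level element, and yours has two sibling table elements.
Try wrapping the content in a body tag and running xmllint on it again: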
<body><table><tr><td><b>Foo</b></td></tr></table>
<table><tr><td>Bar</td></tr></table></body>
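If you'd rather not edit the file by hand, the same wrapping idea can be sketched in Python with the standard library's xml.dom.minidom. This is only a rough illustration, not a replacement for xmllint: the function name pretty_fragments and its root and indent parameters are made up for this example, and it assumes the markup is already well-formed XML apart from the missing root element (so things like an unclosed <br> would still fail to parse).

```python
from xml.dom import minidom


def pretty_fragments(markup: str, root: str = "body", indent: str = "  ") -> str:
    """Wrap sibling elements in one root element and return indented markup.

    Hypothetical helper for illustration; assumes `markup` is well-formed
    XML except for lacking a single top-level element.
    """
    # Drop the line breaks between the crushed-together lines so they do not
    # turn into whitespace-only text nodes in the parsed tree.
    joined = "".join(line.strip() for line in markup.splitlines())
    # Adding a single root element is what makes the input parseable as XML.
    doc = minidom.parseString(f"<{root}>{joined}</{root}>")
    # Serialising the root element (rather than the document) leaves out
    # the <?xml ...?> declaration.
    return doc.documentElement.toprettyxml(indent=indent)
```

Something like print(pretty_fragments(open("file.html").read())) should then print both tables nested inside a body element, one level of indentation per tag.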