
How to pretty-print HTML5 files using Java or Groovy?


I know there are classes I can use to pretty-print XML, like this one in Groovy, but since HTML5 is not necessarily well-formed XML, this will not work.

Are there libraries in Java or Groovy that will pretty-print HTML5?

These are valid HTML5 files and need not be cleaned, just pretty-printed.


Solution

  • You can use the XML pretty-printing examples demonstrated by Mr Haki, but the HTML first has to be cleaned up into well-formed markup. For that you can use something like Jsoup or Neko. An example using Neko could look like this:

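    import org.cyberneko.html.parsers.SAXParser  // NekoHTML's lenient SAX parser (needs the NekoHTML jar on the classpath)

    def url = 'http://java.sun.com'
    // XmlSlurper delegates parsing to NekoHTML, which closes and balances tags as it reads
    def html = new XmlSlurper(new SAXParser()).parse(url)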
    

    This balances the elements into a well-formed tree, and after that you can easily pretty-print it, for example with XmlUtil.
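
For readers on plain Java, the final pretty-printing step can be sketched with the JDK's own `javax.xml.transform` API. This is only a minimal sketch, not the answer's Groovy code: the class name `PrettyPrintSketch` and the method `prettyPrint` are made up for illustration, the input is assumed to be already balanced (for instance by Neko or Jsoup), and the output method is forced to `xml`, so empty elements are serialized XML-style.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of any library.
class PrettyPrintSketch {

    // Re-serializes already well-formed markup with one element per line.
    static String prettyPrint(String wellFormedMarkup) {
        try {
            // newTransformer() with no stylesheet gives an identity transform.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Xalan-style property; the JDK's built-in transformer is assumed to honor it.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(wellFormedMarkup)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Thrown, for example, when the input is not well-formed.
            throw new IllegalArgumentException("Input could not be re-serialized", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prettyPrint("<html><body><p>Hello</p><p>World</p></body></html>"));
    }
}
```

Unbalanced HTML5 input (an unclosed `<p>`, say) makes this sketch throw, which is why the cleaning step with Neko or Jsoup has to come first.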