java html parsing html-content-extraction

What HTML parsing libraries do you recommend in Java?


I want to parse some HTML to find the values of certain attributes, tags, and so on.

What HTML parsers do you recommend? Any pros and cons?


Solution

  • NekoHTML, TagSoup, and JTidy each parse real-world HTML into a well-formed document that you can then process with standard XML tools, such as XPath.
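
To show what the "process with XML tools" step might look like, here is a minimal sketch that runs an XPath query over a DOM using only JDK classes. The class and method names (XPathOverHtml, parseWellFormed, attributeValues) are made up for illustration. The parseWellFormed helper is only a stand-in for the tidying step: it uses the JDK's own XML parser and so assumes the markup is already well-formed. With real-world HTML, the Document would come from one of the libraries above instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathOverHtml {

    // Stand-in for the HTML-cleaning step (hypothetical helper).
    // Assumes the input is ALREADY well-formed; messy HTML would need
    // a tolerant parser such as the libraries named in the answer.
    public static Document parseWellFormed(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
    }

    // Evaluates an XPath expression that selects nodes (for example
    // attributes) and returns each selected node's value as a string.
    public static List<String> attributeValues(Document doc, String expression)
            throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body>"
                + "<a href=\"/docs\">Docs</a>"
                + "<a href=\"/faq\">FAQ</a>"
                + "</body></html>";
        Document doc = parseWellFormed(page);
        // Select the href attribute of every <a> element.
        System.out.println(attributeValues(doc, "//a/@href"));
    }
}
```

Depending on which library builds the DOM, element-name case and XML namespaces may differ from this sketch, so an expression such as //a/@href may need adjusting.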