java screen-scraping htmlunit

How to update content in HTML using HtmlUnit?


I find it very difficult to create new HTML content on the fly with HtmlUnit, the way we can with jQuery.

For example, given a text node:

I am text

I want to change that text node into the following (any word longer than 3 characters is wrapped in a span):

I am <span>text</span>

After this, I want to replace the original text node (I am text) with

I am <span>text</span>

in the HTML document, wherever it occurs.

So how can I achieve this using HtmlUnit? Is there a better alternative to HtmlUnit for Java applications that do screen scraping or modify the DOM on the fly?

In HtmlUnit I could not even find how to construct a new element, as the constructors are mostly missing or declared protected.


Solution

  • It's not clear exactly what you want to do, but HtmlUnit is a programmatic browser. Its API lets you do in Java what a user would do with a keyboard and mouse in a standard browser, and modifying the DOM of a web page is not something a user normally does in a browser.

    Its API gives access to the DOM tree anyway (though not via the W3C DOM interfaces), so you should be able to do in Java what you would do in JavaScript with the DOM. HtmlElement instances can be created through the createElement method of HtmlPage. But of course, there is no "jQuery in Java for HtmlUnit".
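
To make the "do in Java what you would do in JavaScript with the DOM" idea concrete, here is a minimal sketch of the transformation from the question. As an assumption made so that it runs without extra dependencies, it uses the JDK's built-in org.w3c.dom API rather than HtmlUnit's own node classes. The class name SpanWrapper and its methods are hypothetical. With HtmlUnit the steps would be the same (collect the text nodes, build the replacement nodes, swap them in), with elements created through HtmlPage's createElement as noted above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of HtmlUnit: wraps long words in <span> elements.
final class SpanWrapper {

    // Collect text nodes up front so the tree is not modified while it is walked.
    static void collectTextNodes(Node node, List<Node> out) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                out.add(child);
            } else {
                collectTextNodes(child, out);
            }
        }
    }

    // Replace every text node with a fragment in which each whitespace-delimited
    // word longer than 3 characters sits inside its own <span>.
    static void wrapLongWords(Document doc) {
        List<Node> textNodes = new ArrayList<>();
        collectTextNodes(doc.getDocumentElement(), textNodes);
        for (Node text : textNodes) {
            DocumentFragment fragment = doc.createDocumentFragment();
            // Split at word/whitespace boundaries, keeping whitespace runs as tokens.
            String[] tokens = text.getNodeValue().split("(?<=\\s)(?=\\S)|(?<=\\S)(?=\\s)");
            for (String token : tokens) {
                if (token.trim().length() > 3) {
                    Element span = doc.createElement("span");
                    span.setTextContent(token);
                    fragment.appendChild(span);
                } else {
                    fragment.appendChild(doc.createTextNode(token));
                }
            }
            text.getParentNode().replaceChild(fragment, text);
        }
    }

    // Convenience for experiments: parse well-formed markup, transform it, serialize it.
    static String wrap(String markup) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            wrapLongWords(doc);
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.METHOD, "xml");
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process markup", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(wrap("<p>I am text</p>"));
    }
}
```

Two limitations of this sketch: words are split on whitespace only, so trailing punctuation counts toward the length and ends up inside the span, and text that already sits inside a span is wrapped again. A real scraper would probably want to skip such nodes.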