How to use HTML Parser in TextPad?

I am not a complete beginner in Java; I learned it in college. I am currently writing a small program to grab data from a web page. After some Google research, I found that HTML Parser is one of the simpler ways to do that.

My question is: how do I set up the classpath and import the HTML Parser libraries in TextPad?

------My Answer -----------------------------------------------

I have found a way to solve this problem, and I am posting it here in case someone else runs into the same thing.

I do not know whether this is the appropriate way to solve it, but here it is.

I found this page: http://htmlparser.sourceforge.net/javadoc/doc-files/using.html

I downloaded the htmlparser zip file and unzipped its lib folder to my C: drive. Then I ran this line in CMD (I am on a Windows system):

set CLASSPATH=C:\lib\htmlparser.jar;C:\lib\htmllexer.jar;%CLASSPATH%

After that it worked.

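As far as I can tell, this line puts the new .jar files in front of the existing classpath: %CLASSPATH% expands to the classpath's previous value. Note that set only changes the variable for the current CMD session.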
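To confirm that the JVM actually sees the new entries, a small check like the one below can help. It is only a sketch that uses the standard library: it prints each classpath entry and tries to load one class by name. The class name ClasspathCheck is made up for this example, and org.htmlparser.Parser is assumed to be a class shipped in htmlparser.jar; substitute any class you expect to find.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for this example; not part of HTML Parser or TextPad.
class ClasspathCheck {

    // Splits a classpath string into its entries, skipping empty ones.
    static List<String> splitClasspath(String classpath, String separator) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the JVM was started with.
        String classpath = System.getProperty("java.class.path");
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems.
        for (String entry : splitClasspath(classpath, File.pathSeparator)) {
            System.out.println("classpath entry: " + entry);
        }

        // Assumption: this class lives in htmlparser.jar.
        String probe = "org.htmlparser.Parser";
        System.out.println(probe + " available: " + isClassAvailable(probe));
    }
}
```

Compile and run it from the same CMD window in which you ran the set command; a window opened separately will not have the changed CLASSPATH.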

Solution

I have done a fair amount of screen scraping and found Java too cumbersome for it. In my experience, Groovy is a better fit for scraping data, and you won't need to fiddle with the pesky classpath. Groovy is a dynamic language for the JVM with Java-like syntax, so since you already know Java it will be quite straightforward. You can also keep using TextPad as your editor.

For example:

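// TagSoup turns real-world (often malformed) HTML into well-formed SAX events;
// a plain XmlSlurper expects well-formed XML and typically fails on web pages.
@Grab(group='org.ccil.cowan.tagsoup', module='tagsoup', version='1.2')
def slurper = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser())
def htmlParser = slurper.parse("http://stackoverflow.com")

// '**' walks the document depth-first; keep the elements with this class attribute
htmlParser.'**'.findAll { it.@class == 'question-hyperlink' }.each {
    println it
}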

The code above is based on a blog post: http://www.maclovin.de/2010/02/robust-html-parsing-the-groovy-way/
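For comparison, here is roughly what the same depth-first search looks like in plain Java. This is a sketch under assumptions, not a drop-in scraper: it uses only the JDK's built-in XML parser, so it works on well-formed XHTML only, and it runs on an inline sample string instead of a live page. The class name ClassAttributeSearch and the sample markup are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for this example.
class ClassAttributeSearch {

    // Returns the text of every element whose class attribute equals cssClass.
    static List<String> textOfElementsWithClass(String xhtml, String cssClass) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            // "*" matches every element, in document (depth-first) order.
            NodeList all = doc.getElementsByTagName("*");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                Element element = (Element) all.item(i);
                if (cssClass.equals(element.getAttribute("class"))) {
                    result.add(element.getTextContent().trim());
                }
            }
            return result;
        } catch (Exception e) {
            // Malformed input ends up here; the JDK parser does not repair HTML.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for a downloaded page.
        String page = "<html><body>"
                + "<a class=\"question-hyperlink\" href=\"/q/1\">First question</a>"
                + "<a class=\"other\" href=\"/x\">Ignored</a>"
                + "<div><a class=\"question-hyperlink\" href=\"/q/2\">Second question</a></div>"
                + "</body></html>";

        for (String title : textOfElementsWithClass(page, "question-hyperlink")) {
            System.out.println(title);
        }
        // Prints:
        // First question
        // Second question
    }
}
```

The Java version needs noticeably more ceremony for the same traversal, which is the trade-off the answer above is pointing at.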