How can I extract HTML content efficiently with Perl?

I am writing a crawler in Perl, which has to extract contents of web pages that reside on the same server. I am currently using the HTML::Extract module to do the job, but I found the module a bit slow, so I looked into its source code and found out it does not use any connection cache for LWP::UserAgent.

My last resort is to grab HTML::Extract's source code and modify it to use a cache, but I really want to avoid that if I can. Does anyone know any other module that can perform the same job better? I basically just need to grab all the text in the <body> element with the HTML tags removed.

Solution

I use pQuery for my web scraping. But I've also heard good things about Web::Scraper.

Both of these along with other modules have appeared in answers on SO for similar questions to yours:

how can i screen scrape with perl
how can i extract xml of a website and save in a file using perls lwp
how do i extract an html title with perl
can you provide an example of parsing html with your favorite parser
how do I extract content from html file using perl