Tags: java, memory-leaks, visualvm

Hunting memory leaks, VisualVM: "No GC root found". What's next?


I have a memory dump which I made from a dying application. It had consumed all available heap (-Xmx1024m). It uses com.gargoylesoftware.htmlunit.WebClient to crawl web pages: it makes a few HTTP requests per minute and dies after several days. As I see from the dump, it has ~1750 instances of the HtmlPage class, each with tons of related objects, including the full content of a crawled page.

I cannot understand why the HtmlPage instances are not garbage collected. I have investigated the instance references and I don't see any of my code holding a reference to them, and VisualVM says "No GC root found". As I understand it, that should mean the objects are eligible for GC, but they are not being collected.

The application is running as a simple standalone process, it doesn't use any web containers or application servers.

Any hints? What else should I look into?

Specs:

  • htmlunit v2.7
  • java version "1.6.0_13" Java(TM) SE Runtime Environment (build 1.6.0_13-b03) Java HotSpot(TM) Server VM (build 11.3-b02, mixed mode)
  • Linux my.lan 2.6.18-128.el5 #1 SMP Wed Dec 17 11:42:39 EST 2008 i686 i686 i386 GNU/Linux

Update1

I have tried to analyse the dump with the YourKit Java Profiler. It shows a lot of java.lang.ref.Finalizer objects with 310 MB retained size. They are created for the net.sourceforge.htmlunit.corejs.javascript.NativeGenerator#finalize() finalizer, and the NativeGenerator refers to Window, then to HtmlPage, and on to everything else.

Does anybody know why they stay in memory?

Note: Curiously, VisualVM showed "pending finalization" as zero.


Solution

  • Make sure you call webClient.closeAllWindows() once you are done with the page(s). Otherwise the JavaScript thread keeps running and holds references to the page resources, which keeps them from being garbage collected.
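
A minimal sketch of that cleanup pattern, so the close call runs even when fetching a page throws. To keep it runnable without HtmlUnit on the classpath, `Browser` and `FakeBrowser` are hypothetical stand-ins and not part of HtmlUnit. In real code the two calls would presumably be webClient.getPage(url) and webClient.closeAllWindows().

```java
class CrawlCleanupSketch {

    /** Hypothetical stand-in for WebClient; NOT an HtmlUnit type. */
    interface Browser {
        String fetch(String url);   // plays the role of webClient.getPage(url)
        void closeAllWindows();     // plays the role of webClient.closeAllWindows()
    }

    /** Fake browser that only records whether cleanup ran. */
    static class FakeBrowser implements Browser {
        final boolean failOnFetch;
        boolean closed = false;

        FakeBrowser(boolean failOnFetch) {
            this.failOnFetch = failOnFetch;
        }

        public String fetch(String url) {
            if (failOnFetch) {
                throw new IllegalStateException("fetch failed: " + url);
            }
            return "content of " + url;
        }

        public void closeAllWindows() {
            closed = true;
        }
    }

    /** Fetches one page and always closes the windows, on success or failure. */
    static String crawl(Browser browser, String url) {
        try {
            return browser.fetch(url);
        } finally {
            // Runs on both paths, so no window is left open holding the page.
            browser.closeAllWindows();
        }
    }

    public static void main(String[] args) {
        FakeBrowser ok = new FakeBrowser(false);
        System.out.println(crawl(ok, "http://example.com/"));
        System.out.println("closed after success: " + ok.closed);

        FakeBrowser failing = new FakeBrowser(true);
        try {
            crawl(failing, "http://example.com/");
        } catch (IllegalStateException e) {
            System.out.println("closed after failure: " + failing.closed);
        }
    }
}
```

Putting the close call in a finally block matters for a long-running crawler: a single failed request that skips the cleanup would otherwise leave one more page reachable.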