Search code examples
javamultithreadingperformancejakarta-eehp-ux

java one thread slow


I have a J2EE java application which processes SOAP requests. In our production environment (HPUX,OC4J,Java 5) we have about 20 threads running for this process, and we sometimes see 1 thread pausing for ~15 seconds. Until now, I haven't succeeded replicating the problem in our preproduction environment, and I'm scared of breaking stuff and violating SLA's if I use jconsole and associated tools on our production server.

Who has any inspiration? I know about http://java.sun.com/j2se/1.5/pdf/jdk50_ts_guide.pdf but I miss the experience to dare using it straight in production (plus, the HPUX guys threw some of these tools out of the toolbox, replacing them with HPJMeter)

Also, although this suggests a GC problem to me, I don't yet know enough to prove or disprove this theory and I am open to other suggestions.


Solution

  • We connect jconsole (and other tools) straight to production regularly. There is no significant overhead for us, the instrumentation is already going on within the JVM so you'd just be connecting a remote process to read published values. I say go for it!

    Either way, you really need to see what's going on on the box. Thread dumps might or do some internal instrumentation. By internal instrumentation, I mean recording key measures within the code and exposing those somehow. It's essentially what the JVM does (exposing them via JMX) but rolling your own gives you more specificity. For example, I'm frequently recording request/response or other critical path performance timings internally.

    oh, and one more thing. You can setup your app to using an agent to provide even more information. Typically this would be to plug a profiler in (like jprofiler or yourkit) but this does usually add more overhead and isn't recommended for production.

    It's also worth thinking about the cost of not getting the information you need out of the VM. For example, is the cost of not fixing the issue more or less than the cost of a small % drop of performance when monitoring?

    More scientifically, this article has some comments. It's suggesting up to 7% overhead (contradicting my previous point), a previous article from 2006 suggests 3-4% but both are highly contextual results. For example, CPU intensive applications may or may not be affected more than IO bound ones.

    So a more appropriate answer from me (rather than just "go for it") would be to understand the impact it would have for your application in your environment through measurement. Run representative tests on a similar environment to production with jconsole connected and disconnected and see for your self.

    Also see this stackoverflow question.