I'd like to develop some kind of reverse debugger for Java (where you can step back during execution). To do this, I have to know the initial state of the JVM (which can easily be obtained from a core dump). Then I have to intercept every memory access the JVM performs, so that I have a timeline of what the JVM has been doing during execution and can reconstruct every state it went through.
So what I need is a way to intercept the memory accesses with low performance overhead, meaning the solution shouldn't add more than 200-300% to the JVM's execution time, which is already a lot.
Some ideas that come to mind:
- using ptrace, but it is really slow
- developing some kind of simple virtual machine in which I run the JVM (on top of the guest OS), where this virtual machine intercepts all memory accesses of the JVM executable; this would be similar to VMware's Replay debugger feature. The problem is that I don't know how to do this, or whether it is possible at all.
Effectively, you want to monitor changes of Java objects.

Tracking memory changes at a level lower than the JVM is one option. Maximum precision would require recording each individual write. For snapshotting, you could use:

- ptrace for process suspension and gaining access to process memory
- fork-based asynchronous snapshots using custom code or core dumps (taking advantage of copy-on-write memory, the main process does not have to be suspended); a rough sketch of the core-dump variant follows this list
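As an illustration of the core-dump variant only (the fork/copy-on-write variant would need native code), here is a hedged Java sketch that shells out to the external gcore utility (shipped with gdb) to snapshot the JVM's own process. The class name and file prefix are invented for the example, and note that gcore briefly suspends the target via ptrace while it writes the dump.

```java
/**
 * Rough sketch: snapshot the whole JVM process by invoking the external
 * "gcore" tool (part of gdb) on our own PID. Each dump captures all
 * process memory, i.e. the Java heap plus unrelated JVM internals.
 * Class name and file prefix are invented for this example.
 */
public class CoreDumpSnapshot {

    public static void snapshot(String prefix) throws Exception {
        long pid = ProcessHandle.current().pid();   // Java 9+
        // gcore attaches via ptrace, writes "<prefix>.<pid>", then detaches;
        // the process is suspended only while the dump is being written.
        Process gcore = new ProcessBuilder("gcore", "-o", prefix, Long.toString(pid))
                .inheritIO()
                .start();
        if (gcore.waitFor() != 0) {
            throw new IllegalStateException("gcore failed");
        }
    }

    public static void main(String[] args) throws Exception {
        snapshot("jvm-snapshot");   // produces jvm-snapshot.<pid>
    }
}
```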
The downside of that option is that you'd also be forced to track writes that are unrelated to the Java heap itself (JVM internals, garbage collection, monitors, libraries, ...). Writes affecting the Java heap represent a subset of all writes taking place in the process at any given time. Also, it'd be less straightforward to extract the actual Java objects from those process snapshots/dumps without actual JVM code.
Monitoring changes at the JVM level is a more favorable strategy. Here, too, maximum precision means recording each individual write, which requires instrumentation (see below). For snapshotting, you could use:

- JVMTI heap iteration via IterateThroughHeap and/or FollowReferences
- regular heap dumps via HotSpotDiagnosticMXBean:

```java
HotSpotDiagnosticMXBean mxbean = ManagementFactory.newPlatformMXBeanProxy(
        ManagementFactory.getPlatformMBeanServer(),
        "com.sun.management:type=HotSpotDiagnostic",
        HotSpotDiagnosticMXBean.class);
mxbean.dumpHeap("dump.hprof", true);
```
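If you go the snapshot/sampling route, the dump shown above could be taken periodically, for example from a scheduler thread. The following is only an illustrative sketch (class name, interval, and file naming are arbitrary choices); keep in mind that each dump typically pauses application threads and the files can be large.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: periodically writes HPROF snapshots of the live Java heap. */
public class HeapDumpSampler {

    public static ScheduledExecutorService start(String dir, long periodSeconds) throws Exception {
        HotSpotDiagnosticMXBean mxbean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        AtomicInteger counter = new AtomicInteger();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // "true" restricts the dump to live (reachable) objects.
                mxbean.dumpHeap(dir + "/dump-" + counter.getAndIncrement() + ".hprof", true);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```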
The "right" approach depends on desired performance characteristics, target platform, portability (can it rely on a specific JVM implementation/version), and precision/resolution (snapshots/sampling [aggregating writes] vs. instrumentation [recording each individual write]).
In terms of performance, doing the monitoring at the JVM level tends to be more efficient as only the actual Java heap writes have to be taken into account. Integrating your monitoring solution into the VM and taking advantage of the GC write barrier could be a low-overhead solution, but would also be the least portable one (tied to a specific JVM implementation/version).
If you need to record each individual write, you have to go the instrumentation route, and that will most likely incur significant runtime overhead: writes cannot be aggregated, so there is no optimization potential there.
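To make the instrumentation route concrete, here is a minimal, hedged sketch (not a complete solution) of a java.lang.instrument agent built on the ASM bytecode library: it rewrites application classes so that every field write (PUTFIELD/PUTSTATIC) is reported to a recorder just before it executes. WriteRecordingAgent and WriteLog.record(...) are names invented for the example, and the class filtering is deliberately simplistic.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

/**
 * Sketch of an instrumentation agent (ASM-based) that reports every
 * field write to a hypothetical static recorder WriteLog.record(owner, field).
 */
public class WriteRecordingAgent {

    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain pd, byte[] classfileBuffer) {
                // Skip JDK classes and the recorder itself (simplistic filter).
                if (className == null || className.startsWith("java/")
                        || className.startsWith("jdk/") || className.startsWith("sun/")
                        || className.equals("WriteLog")) {
                    return null; // null means "leave the class unchanged"
                }
                ClassReader reader = new ClassReader(classfileBuffer);
                ClassWriter writer = new ClassWriter(reader, ClassWriter.COMPUTE_MAXS);
                reader.accept(new ClassVisitor(Opcodes.ASM9, writer) {
                    @Override
                    public MethodVisitor visitMethod(int access, String name, String desc,
                                                     String signature, String[] exceptions) {
                        MethodVisitor mv = super.visitMethod(access, name, desc, signature, exceptions);
                        return new MethodVisitor(Opcodes.ASM9, mv) {
                            @Override
                            public void visitFieldInsn(int opcode, String owner,
                                                       String fieldName, String fieldDesc) {
                                if (opcode == Opcodes.PUTFIELD || opcode == Opcodes.PUTSTATIC) {
                                    // Report the write (owner class + field name) before it happens.
                                    super.visitLdcInsn(owner);
                                    super.visitLdcInsn(fieldName);
                                    super.visitMethodInsn(Opcodes.INVOKESTATIC, "WriteLog", "record",
                                            "(Ljava/lang/String;Ljava/lang/String;)V", false);
                                }
                                super.visitFieldInsn(opcode, owner, fieldName, fieldDesc);
                            }
                        };
                    }
                }, 0);
                return writer.toByteArray();
            }
        });
    }
}
```

Such an agent would be packaged in a jar with a Premain-Class manifest entry and attached via -javaagent. Even this minimal version adds a method call per field write, which is exactly where the significant overhead comes from, and array stores (AASTORE, IASTORE, ...) would still have to be handled separately.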
In terms of sampling/snapshotting, implementing a JVMTI agent could be a good compromise. It provides high portability (works with many JVMs) and high flexibility (the iteration and processing can be tailored to your needs, as opposed to relying on standard HPROF heap dumps).