We recently decided to enable GC logging for several Apache Storm processes: Nimbus, UI, DRPC server, supervisor, and logviewer. (Storm workers are covered by another question: How to enable GC logging for Apache Storm workers, while preventing log file overwrites and capping disk space usage.) We are doing this on a number of clusters (exact version varies) as a aid to looking into Storm-related memory and garbage collection problems. While doing this, we want to avoid two problems we know might happen:
When Java GC logging starts for a process it seems to replace the content of any file that has the same name. This means that unless you are careful, you will lose the GC logging, perhaps when you are more likely to need it.
If you keep the cluster running long enough, log files will fill up disk unless managed. Even if GC logging is not currently voluminous we want to manage the risk of an unusual situation arising that causes the logging rate to suddenly spike up.
I have solutions for these except that the overwrite prevention will not work for all versions of Java. I welcome suggestions.
You'll need to set some JVM parameters when starting the servers. You can set JVM options for Storm processes through the following storm.yaml properties (if you are managing Storm through Apache Ambari, look under Storm service > configs > advanced storm-site > worker.childopts):
Now lets discuss the JVM parameters to include in those.
To enable GC logging to a file, you will need to add -verbose:gc -Xloggc:<log-file-location>
.
You need to give the log file name special consideration to prevent overwrites. Obviously you will want to have different GC log file names for the different kinds of Storm processes. It also seems like you need to have a unique name for every invocation and adding a timestamp seems to be the best solution. In some versions of Java you can put a "%t" in the Xloggc value and it will be replaced by the timestamp for when the process started, formatted as <YYYY>-<MM>-<DD>_<HH>-<MM>-<SS>
. However, in other versions of Java the "%t" will be included in the file name literally.
So far the JVM options are (using Nimbus as an example) -verbose:gc -Xloggc:/var/log/storm/storm-nimbus-gc.log-%t
(the path should match your Storm logging directory, you can name the log file differently if you prefer, and the "-%t" should be omitted where Java doesn't support).
Now onto managing use of disk space. I'll be happy if there is a simpler way that what I have.
First, take advantage of Java's built-in GC log file rotation. -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
is an example of enabling this rotation, having up to 10 GC log files from the JVM, each of which is no more than approx 10MB in size. 10 x 10MB is 100MB max usage.
With the GC log file rotation in place with up to 10 files, '.0', '.1', ... '.9' will be added to the file name you gave in Xloggc
. .0 will be first and after it reaches .9 it will replace .0 and continue on in a round robin manner. In some versions of Java '.current' will be additionally put on the end of the name of the log file currently being written to.
Due to the unique file naming we apparently have to have to avoid overwrites, you can have 100MB per process invocation, so this is not a total solution to managing disk space used by the Storm GC logs. You will end up with a set of up to 10 GC log files on each server invocation -- this can add up over time. The best solution (under *nix) to that would seem to be to use the logrotate utility (or some other utility) to periodically clean up the GC logs that have not been modified in the last N days.
Be sure to do the math and make sure you will have enough disk space. Note that your machines will likely have multiple processes writing GC logs at same time.
People frequently want more details and context in their GC logs than the default, so consider adding in -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
.
All together, you will be adding something like this to the properties:
-verbose:gc -Xloggc:/var/log/storm/storm-nimbus-gc.log-%t -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
.-verbose:gc -Xloggc:/var/log/storm/storm-supervisor-gc.log-%t -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
.-verbose:gc -Xloggc:/var/log/storm/storm-ui-gc.log-%t -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
.-verbose:gc -Xloggc:/var/log/storm/storm-drpc-gc.log-%t -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
.-verbose:gc -Xloggc:/var/log/storm/storm-logviewer-gc.log-%t -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
.When doing this, check if some of these JVM parameters are already present. Also remember to take take of clearing off old GC logs with logrotate or something.
The GC logging will take affect upon process restart.