In the AWS re:Invent presentation on Lambda performance (highly recommended), on pp. 33-34, the author lists the number of classes loaded from each library using the following command:
java -cp my.jar -verbose:class Handler | grep '\[Loaded' | grep '.jar\]' | sed -r 's/\[Loaded \([^A-Z]*\)[\$A-Za-z0-9]*from.*\]/\1/g' | sort | uniq -c | sort
This extracts the package name up to, but not including, the first capital letter, which marks the start of the class name. The output is supposed to look something like this:
143 com.fasterxml.jackson
219 org.apache.http
373 com.google
507 com.amazonaws
However, this only works with the Java 8 class loader logs, which have the following format (this example should output java.io):
[Loaded java.io.Serializable from shared objects file]
As of Java 9, the class loader logs have a different format:
[0.041s][info][class,load] java.io.Serializable source: jrt:/java.base
How does the sed command need to be updated to produce the same output as above?
I've tried the following, but the output is the entire line, not just the class library captured by the regex group. I'm also running on a Mac, so I had to add the -r flag and remove some of the escape characters:
java -cp my.jar -verbose:class Handler | grep '[class,load]' | grep '.jar' | sed -r 's/.*\[class,load\] ([^A-Z]*)[$A-Za-z0-9]*source.*/\1/g'
Since the fields in each record are separated by spaces, we can use cut to pick out the desired field and then use sed to extract the package substring. The regex ([a-z.]+)\.[A-Z].* matches lowercase letters and dots up to the first dot that is followed by an uppercase letter. Package names that contain digits or underscores would need a wider character class, such as [a-z0-9_.].
echo "[0.041s][info][class,load] java.io.Serializable source: jrt:/java.base" | cut -d ' ' -f2 | sed -E 's/([a-z.]+)\.[A-Z].*/\1/g'
Result:
java.io
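To get the per-library counts the question asks for, the same cut and sed steps can be dropped into the counting pipeline. The following is a minimal sketch under stated assumptions, not a drop-in replacement for the presentation's command: the function name count_packages is hypothetical, and the sample log lines (class names, timestamps, jar path) are made up to mimic the Java 9+ format.

```shell
# Hypothetical helper: reads Java 9+ -verbose:class output on stdin and
# prints the number of classes loaded per package, smallest count first.
count_packages() {
  grep -F '[class,load]' |      # -F matches the brackets literally
    grep -F '.jar' |            # keep only classes loaded from a jar
    cut -d ' ' -f2 |            # second space-separated field is the class name
    sed -E 's/([a-z.]+)\.[A-Z].*/\1/' |
    sort | uniq -c | sort -n
}

# Made-up sample lines in the Java 9+ log format (assumption, for illustration)
printf '%s\n' \
  '[0.041s][info][class,load] java.io.Serializable source: jrt:/java.base' \
  '[0.312s][info][class,load] com.example.app.Handler source: file:/tmp/my.jar' \
  '[0.313s][info][class,load] com.example.app.Config source: file:/tmp/my.jar' \
  '[0.320s][info][class,load] org.sample.util.Strings source: file:/tmp/my.jar' |
  count_packages
```

With real output, the printf would be replaced by the java -cp my.jar -verbose:class Handler invocation from the question. Note that grep -F is used because an unquoted-bracket pattern such as '[class,load]' is otherwise read as a character class rather than literal text.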
If a sed-only solution is preferred, this command does the jobs of grep and cut as well:
echo "[0.041s]..." | sed -nE '/class,load/ { s/[^ ]+ ([^ ]+)/\1/; s/([a-z.]+)\.[A-Z].*/\1/p; }'
grep   : /class,load/
group  : { ...; } applies both substitutions only to lines matching class,load
cut    : s/[^ ]+ ([^ ]+)/\1/
extract: s/([a-z.]+)\.[A-Z].*/\1/p
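As for why the attempt in the question printed the whole line: the pattern [$A-Za-z0-9]*source leaves no room for the space between the class name and the word source, so the regex never matches and sed passes the line through unchanged. A minimal sketch of that attempt with the space added, and a literal dot placed before the class name so the captured group has no trailing dot, might look like this. The function name extract_package is hypothetical, and the sketch assumes a sed that supports -E.

```shell
# Hypothetical helper: the question's sed attempt, with the missing space
# before "source" and a literal dot before the class name added.
extract_package() {
  sed -nE '/\[class,load\]/ s/.*\[class,load\] ([^A-Z]*)\.[$A-Za-z0-9]* source.*/\1/p'
}

echo '[0.041s][info][class,load] java.io.Serializable source: jrt:/java.base' |
  extract_package
```

Like the original, this keeps the $ in the class-name bracket so nested classes such as Map$Entry are still consumed, but class names containing other characters (an underscore, for example) would not match and would need a wider bracket expression.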