java regex sed classloader

Get count of java classes loaded per library


In the AWS re:Invent presentation on Lambda performance (highly recommended), on pp. 33-34, the author lists the number of classes loaded from each library using the following command:

java -cp my.jar -verbose:class Handler | grep '\[Loaded' | grep '.jar\]' | sed -r 's/\[Loaded \([^A-Z]*\)[\$A-Za-z0-9]*from.*\]/\1/g' | sort | uniq -c | sort

This extracts the namespace up to, but not including, the first capital letter, which is where the class name begins. The output is supposed to look something like this:

143 com.fasterxml.jackson
219 org.apache.http
373 com.google
507 com.amazonaws

However, this only works with the Java 8 class loader logs, which have the following format (this example should output java.io):

[Loaded java.io.Serializable from shared objects file]

As of Java 9, the class loader logs use a different format:

[0.041s][info][class,load] java.io.Serializable source: jrt:/java.base

How does the sed command need to be updated to produce the same output as above?

I've tried the following, but the output is the entire line, not just the library namespace. I'm also running on a Mac, so I had to add the -r flag and remove some of the escape characters:

java -cp my.jar -verbose:class Handler | grep '[class,load]' | grep '.jar' | sed -r 's/.*\[class,load\] ([^A-Z]*)[$A-Za-z0-9]*source.*/\1/g'

Solution

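  • The attempt in the question prints the whole line because its pattern expects source immediately after the class name, but a space separates the two, so the substitution never matches and sed passes the line through unchanged. Because the fields in each record are space-separated, cut can pick out the class-name field and sed can then extract the package substring. The regex ([a-z.]+)\.[A-Z].* matches lower-case letters and dots up to the first dot that is followed by an upper-case letter.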

    echo "[0.041s][info][class,load] java.io.Serializable source: jrt:/java.base" | cut -d ' ' -f2 | sed -E 's/([a-z.]+)\.[A-Z].*/\1/g'
    

    Result:

    java.io
    

    If a sed-only solution is preferred, this command also does the work of grep and cut:

    echo "[0.041s]..." | sed -nE '/class,load/ s/[^ ]+ ([^ ]+)/\1/ ; s/([a-z.]+)\.[A-Z].*/\1/p'
    

    grep:    /class,load/
    cut:     s/[^ ]+ ([^ ]+)/\1/
    extract: s/([a-z.]+)\.[A-Z].*/\1/p
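To get back to the per-library counts from the question, the cut and sed steps above can be dropped into the original grep ... | sort | uniq -c pipeline. The following is only a sketch: the sample log lines, the /tmp/my.jar path, and the function names sample_log and count_packages are made up for illustration, and it assumes that classes loaded from a jar are logged with a source: containing .jar.

```shell
#!/bin/sh
# Hypothetical sample of Java 9+ -verbose:class output (illustrative only;
# real logs will contain many more lines and different paths).
sample_log() {
cat <<'EOF'
[0.041s][info][class,load] java.io.Serializable source: jrt:/java.base
[0.250s][info][class,load] com.amazonaws.services.lambda.runtime.Context source: file:/tmp/my.jar
[0.251s][info][class,load] com.amazonaws.services.lambda.runtime.LambdaLogger source: file:/tmp/my.jar
[0.300s][info][class,load] com.fasterxml.jackson.databind.ObjectMapper source: file:/tmp/my.jar
EOF
}

# Reads class-load log lines on stdin and prints "count package" pairs.
count_packages() {
  grep -F '[class,load]' |                  # keep class-load records (-F: literal match, not a bracket expression)
    grep -F '.jar' |                        # keep classes whose source mentions a jar
    cut -d ' ' -f2 |                        # second space-separated field: fully qualified class name
    sed -E 's/([a-z.]+)\.[A-Z].*/\1/' |     # drop the class name, keep the package
    sort | uniq -c | sort -n                # count per package, smallest count first
}

sample_log | count_packages
```

With a real application, sample_log would be replaced by java -cp my.jar -verbose:class Handler. For the sample above, the JDK class is filtered out and the remaining packages are counted once (com.fasterxml.jackson.databind) and twice (com.amazonaws.services.lambda.runtime). Note that [a-z.]+ does not match package segments containing digits or underscores (for example lang3); one possible variation, borrowing from the question's original pattern, is s/^([^A-Z]*)\.[A-Z].*/\1/.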