java class md5 bytecode

How much do Java .class files vary across different compilers, versions, and dependencies?


Hi, I was wondering how much Java class files change across different compilers. How much do the actual bytes change if a .java file is compiled by, say, Sun JDK 1.4, 1.5, 1.6, or even an IBM JDK? I know that class files can differ with regard to debug information and obfuscation, but let's assume for this question that those options are the same: debug information included, no obfuscation. If I ran an MD5 or SHA-1 hash on a .class file compiled by JDK 1.4, would the hash be different if I compiled it with JDK 1.5 but targeting 1.4? What about when targeting JDK 1.5?

Related to that, does the binary of a class file change when different dependencies are used? Asked differently, can the binary of a class file change based on its dependencies?

And last but not least, are there programmatic ways to analyse the metadata of a .class file in order to identify the compiler version and/or the switches that were used when compiling it?


Solution

  • Java compilers have considerable freedom when generating classes and bytecode from source. They can reorder the methods, reorder the constant pool (which holds class names, method names and strings; because instructions refer to constant pool entries by index, this changes the method bytecode too) and reorder the actual bytecode instructions, as long as the result of executing them is the same.

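    So using MD5 or similar hashes to prove that two class files came from the same source is not really sensible: the same source can legitimately compile to different bytes, so a hash mismatch tells you nothing about the source.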
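
To make the point about hashes concrete, here is a minimal sketch (not part of the original answer). It reads the major version from the class-file header and computes an MD5 over the bytes. The class name `ClassFileProbe` and its method names are made up for illustration. It relies on the standard class-file layout: a 4-byte magic number `0xCAFEBABE`, then a 2-byte minor version, then a 2-byte major version, where javac writes major version 48 for a 1.4 target, 49 for 1.5 and 50 for 1.6. Since that field alone differs between targets, the hash of the file has to differ too, even if everything else happened to be byte-identical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only.
public class ClassFileProbe {

    // Reads major_version from the class-file header
    // (48 = 1.4 target, 49 = 1.5 target, 50 = 1.6 target).
    static int majorVersion(byte[] classBytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IOException("not a class file: bad magic number");
            }
            in.readUnsignedShort();        // minor_version, ignored here
            return in.readUnsignedShort(); // major_version
        }
    }

    // Hex-encoded MD5 of the given bytes.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: java ClassFileProbe <path-to-.class-file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("major version: " + majorVersion(bytes));
        System.out.println("md5: " + md5Hex(bytes));
    }
}
```

Running this against the same source compiled once with `-target 1.4` and once with `-target 1.5` should report different major versions and therefore different hashes. The major version only tells you the targeted class-file format, not which vendor's compiler or which JDK release produced the file.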