
How to repackage HttpClient 4.3.1 and remove dependencies on commons-logging?


I want to repackage Apache's HttpClient library to ship it with an Android app (like https://code.google.com/p/httpclientandroidlib/ but with HttpClient 4.3.1).

Therefore, I downloaded the httpclient 4.3.1 jar (includes all its dependencies) by hand and used jarjar to repackage it:

x@x$: cd libs && for f in *.jar; do java -jar ../jarjar-1.4.jar process ../rules.txt $f out/my-$f; done

with rules.txt:

 rule org.apache.http.** my.repackaged.org.apache.http.@1

Then I used ant to put the output together:

<project name="MyProject" default="merge" basedir=".">
  <target name="merge">
        <zip destfile="my-org-apache-httpclient-4.3.1.jar">
            <zipgroupfileset dir="libs/out" includes="*.jar"/>
        </zip>
  </target>
</project>

I can use that file to develop and test my app, but when I deploy it on Android, it throws an exception saying roughly that it cannot find my.repackaged.org.apache.logging.log4j.something referenced by my.package.org.apache.logging.whatEver.

So, now I want to strip out any dependency on commons-logging by using bytecode manipulation. This has been done before: http://sixlegs.com/blog/java/dependency-killer.html

But I wonder how I actually do it? There are only dependencies on org.apache.commons.logging.Log:

x@x$: java -jar jarjar-1.4.jar find jar my-org-apache-httpclient-4.3.1.jar commons-logging-1.1.3.jar
my/http/impl/execchain/ServiceUnavailableRetryExec -> org/apache/commons/logging/Log
my/http/impl/execchain/RetryExec -> org/apache/commons/logging/Log
my/http/impl/execchain/RedirectExec -> org/apache/commons/logging/Log
my/http/impl/execchain/ProtocolExec -> org/apache/commons/logging/Log
...

I think the way to go is to remove these dependencies and replace them with my own implementation, as was done in https://code.google.com/p/httpclientandroidlib/ . Therefore, I made a new Maven project containing a single class (with commons-logging in provided scope) that implements the org.apache.commons.logging.Log interface and just delegates to android.util.Log:

MyLog implements org.apache.commons.logging.Log {}

in the package my.log and I packaged that in my-log-1.0.0.jar. I put that jar into the same folder as the repackaged httpclient-jars and used ant as mentioned above to package all together in my-org-apache-httpclient-4.3.1.jar.


Approach 1

I tried to use jarjar again:

java -jar jarjar-1.4.jar process rules2.txt my-org-apache-httpclient-4.3.1.jar my-org-apache-httpclient-4.3.1-without-logging-dep.jar

with rules2.txt:

rule my.repackaged.commons.logging.** my.log.@1

but that does not work. The exception that it cannot find my.repackaged.org.apache.logging.log4j.something referenced by my.package.org.apache.logging.whatEver is still thrown.


Approach 2

I also tried to delete the logging stuff from the final jar and/or repackage the my.repackaged.org.apache.log4j and logging to its original packages:

rules2.txt v2:

rule my.repackaged.org.apache.log4j.** org.apache.log4j.@1
rule my.repackaged.org.apache.logging.** org.apache.logging.@1

but that still throws the exception: my.repackaged.org.apache.logging.log4j.something referenced by my.package.org.apache.logging.whatEver


QUESTION

How can I kill/replace the commons-logging dependencies and get rid of the exception?


Solution

  • Introduction

    If a program depends on a library it usually means that it uses methods of the library. Removing a dependency is therefore not a simple task. You effectively want to take away code that is - at least formally - required by the program.

    There are three ways of removing dependencies:

    1. Adapt the source code to not depend on the library and compile it from scratch.
    2. Modify the bytecode to remove references to the library the project depends on.
    3. Manipulate the runtime to not require the dependency. The easiest way is to recreate the required classes and to put them into the jar file.

    None of these ways are really pretty. All of them can require a lot of work. None are guaranteed to work without side effects.

    Solution

    I will describe my solution by presenting the files and steps I used to solve the problem. To reproduce, you will need the following files (in a single directory):

    lib/xxx-v.v.v.jar: The library jars (httpclient and dependencies, excluding commons-logging-1.1.3.jar)
    jarjar-1.4.jar: Used for repackaging the jars
    rules.txt: The jarjar rules

    rule org.apache.http.** my.http.@1
    rule org.apache.commons.logging.** my.logging.@1
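
    To make the two rules concrete, here is a minimal sketch of the name mapping they describe. It is not jarjar's implementation; RuleSketch and apply are hypothetical names, and the sketch assumes only what the jarjar rule syntax states: "**" matches any part of a qualified name, "*" a single package segment, and "@1" refers to the first wildcard.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jarjar: mimics how a rule such as
// "rule org.apache.http.** my.http.@1" maps one class name to another.
class RuleSketch {

    // Returns the renamed class, or the original name if the pattern does not match.
    static String apply(String pattern, String result, String className) {
        // Translate the wildcard pattern into a regular expression:
        // "**" matches anything (dots included), "*" matches one package segment.
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '*') {
                if (i + 1 < pattern.length() && pattern.charAt(i + 1) == '*') {
                    regex.append("(.*)");
                    i++; // consume the second '*'
                } else {
                    regex.append("([^.]*)");
                }
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        Matcher m = Pattern.compile(regex.toString()).matcher(className);
        if (!m.matches()) {
            return className;
        }
        // Substitute @n with the text captured by the n-th wildcard.
        // Counting down avoids "@1" clobbering the start of "@10".
        String renamed = result;
        for (int g = m.groupCount(); g >= 1; g--) {
            renamed = renamed.replace("@" + g, m.group(g));
        }
        return renamed;
    }

    public static void main(String[] args) {
        // prints my.http.impl.execchain.RetryExec
        System.out.println(apply("org.apache.http.**", "my.http.@1",
                "org.apache.http.impl.execchain.RetryExec"));
        // prints my.logging.Log
        System.out.println(apply("org.apache.commons.logging.**", "my.logging.@1",
                "org.apache.commons.logging.Log"));
    }
}
```

    The second rule is the important one: every reference to org.apache.commons.logging.Log inside the library ends up pointing at my.logging.Log, which is why a class with that exact name has to be supplied.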
    

    build.xml: Ant build configuration

    <project name="MyProject" basedir=".">
        <target name="logimpl">
            <javac srcdir="java/src" destdir="java/bin" target="1.5" />
            <jar jarfile="out/logimpl.jar" basedir="java/bin" />
        </target>
        <target name="merge">
            <zip destfile="httpclient-4.3.1.jar">
                <zipgroupfileset dir="out" includes="*.jar"/>
            </zip>
        </target>
    </project>
    

    java/src/Log.java

    package my.logging;
    
    public interface Log {
        public boolean isDebugEnabled();
        public void debug(Object message);
        public void debug(Object message, Throwable t);
    
        public boolean isInfoEnabled();
        public void info(Object message);
        public void info(Object message, Throwable t);
    
        public boolean isWarnEnabled();
        public void warn(Object message);
        public void warn(Object message, Throwable t);
    
        public boolean isErrorEnabled();
        public void error(Object message);
        public void error(Object message, Throwable t);
    
        public boolean isFatalEnabled();
        public void fatal(Object message);
        public void fatal(Object message, Throwable t);
    }
    

    java/src/LogFactory.java

    package my.logging;
    
    public class LogFactory {
    
        private static Log log;
    
        public static Log getLog(Class<?> clazz) {
            return getLog(clazz.getName());
        }
    
        public static Log getLog(String name) {
            if(log == null) {
                log = new Log() {
                    public boolean isWarnEnabled() { return false; }
                    public boolean isInfoEnabled() { return false; }
                    public boolean isFatalEnabled() { return false; }
                    public boolean isErrorEnabled() {return false; }
                    public boolean isDebugEnabled() { return false; }
                    public void warn(Object message, Throwable t) {}
                    public void warn(Object message) {}
                    public void info(Object message, Throwable t) {}
                    public void info(Object message) {}
                    public void fatal(Object message, Throwable t) {}
                    public void fatal(Object message) {}
                    public void error(Object message, Throwable t) {}
                    public void error(Object message) {}
                    public void debug(Object message, Throwable t) {}
                    public void debug(Object message) {}
                };
            }
            return log;
        }
    
    }
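
    The factory above discards every message. If you want to see the library's log output, the anonymous class can forward to a real logger instead. The following is only a sketch under assumptions: JulLogSketch and forName are hypothetical names, the nested interface merely mirrors my.logging.Log so the file compiles on its own, and java.util.logging is used as the backend with a level mapping I picked myself. On Android you could call android.util.Log in the same places.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: a Log that forwards to java.util.logging instead of discarding messages.
// The nested interface mirrors my.logging.Log so that this file compiles on its own.
class JulLogSketch {

    interface Log {
        boolean isDebugEnabled();
        void debug(Object message);
        void debug(Object message, Throwable t);
        boolean isInfoEnabled();
        void info(Object message);
        void info(Object message, Throwable t);
        boolean isWarnEnabled();
        void warn(Object message);
        void warn(Object message, Throwable t);
        boolean isErrorEnabled();
        void error(Object message);
        void error(Object message, Throwable t);
        boolean isFatalEnabled();
        void fatal(Object message);
        void fatal(Object message, Throwable t);
    }

    // Assumed level mapping: debug=FINE, info=INFO, warn=WARNING, error and fatal=SEVERE.
    static Log forName(String name) {
        final Logger logger = Logger.getLogger(name);
        return new Log() {
            private void log(Level level, Object message, Throwable t) {
                logger.log(level, String.valueOf(message), t);
            }
            public boolean isDebugEnabled() { return logger.isLoggable(Level.FINE); }
            public void debug(Object m) { log(Level.FINE, m, null); }
            public void debug(Object m, Throwable t) { log(Level.FINE, m, t); }
            public boolean isInfoEnabled() { return logger.isLoggable(Level.INFO); }
            public void info(Object m) { log(Level.INFO, m, null); }
            public void info(Object m, Throwable t) { log(Level.INFO, m, t); }
            public boolean isWarnEnabled() { return logger.isLoggable(Level.WARNING); }
            public void warn(Object m) { log(Level.WARNING, m, null); }
            public void warn(Object m, Throwable t) { log(Level.WARNING, m, t); }
            public boolean isErrorEnabled() { return logger.isLoggable(Level.SEVERE); }
            public void error(Object m) { log(Level.SEVERE, m, null); }
            public void error(Object m, Throwable t) { log(Level.SEVERE, m, t); }
            public boolean isFatalEnabled() { return logger.isLoggable(Level.SEVERE); }
            public void fatal(Object m) { log(Level.SEVERE, m, null); }
            public void fatal(Object m, Throwable t) { log(Level.SEVERE, m, t); }
        };
    }

    public static void main(String[] args) {
        Log log = forName("my.http.demo");
        log.warn("connection retry"); // handled by the default java.util.logging setup
    }
}
```

    To use this idea in the build above, the body of forName would go into LogFactory.getLog(String) in place of the no-op instance.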
    

    do_everything.sh

    #!/bin/sh
    
    # Repackage library
    mkdir -p out
    for jf in lib/*.jar; do
        java -jar jarjar-1.4.jar process rules.txt "$jf" "out/$(basename "$jf")"
    done
    
    # Compile logging implementation
    mkdir -p java/bin
    ant logimpl
    
    # Merge jar files
    ant merge
    

    That's it. Open up a console and execute

    cd my_directory && ./do_everything.sh
    

    This will create a folder "out" containing the individual repackaged jar files, and a file "httpclient-4.3.1.jar", which is the final, self-contained, working jar file. So, what did we just do?

    1. Repackaged httpclient (now in my.http)
    2. Modified the library to use my.logging instead of org.apache.commons.logging
    3. Compiled required classes to be used by the library (my.logging.Log and my.logging.LogFactory).
    4. Merged the repackaged libraries and the compiled classes into a single jar file, httpclient-4.3.1.jar.

    Pretty simple, isn't it? Just read the shell script line by line to follow the individual steps. To check whether all dependencies were removed, you can run

    java -jar jarjar-1.4.jar find class httpclient-4.3.1.jar commons-logging-1.1.3.jar
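
    If you want a second opinion that does not rely on jarjar, a crude check is possible in a few lines, because class files keep the names of referenced classes as plain text in their constant pool. This is a rough sketch, not a replacement for the command above: ReferenceScanSketch and mentions are hypothetical names, and a string literal that happens to contain the package name would be reported as well.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Crude, hypothetical check: a leftover reference shows up as a substring
// of the raw class file, because the constant pool stores names as text.
class ReferenceScanSketch {

    static boolean mentions(byte[] classBytes, String internalName) {
        // ISO-8859-1 maps every byte to exactly one char, so nothing is lost in decoding.
        String raw = new String(classBytes, StandardCharsets.ISO_8859_1);
        return raw.contains(internalName);
    }

    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    // Usage: java ReferenceScanSketch httpclient-4.3.1.jar
    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: ReferenceScanSketch <jar>");
            return;
        }
        String needle = "org/apache/commons/logging/";
        try (JarFile jar = new JarFile(new File(args[0]))) {
            for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements();) {
                JarEntry entry = e.nextElement();
                if (!entry.getName().endsWith(".class")) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    if (mentions(readAll(in), needle)) {
                        System.out.println(entry.getName()); // still references commons-logging
                    }
                }
            }
        }
    }
}
```

    An empty output means no class in the jar mentions the original package any more.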
    

    I tried the generated jar file with SE7 and Android 4.4, it worked in both cases (see below for remarks).

    Class file version

    Every class file has a major version and a minor version (both depend on the compiler). The Android SDK requires class files to have a major version less than 0x33 (so everything pre 1.7 / JDK 7). I added the target="1.5" attribute to the ant javac task so the generated class files have a major version of 0x31 and can therefore be included in your Android app.
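
    To verify the version of a compiled class yourself, you can read it from the file header: a class file starts with the magic number 0xCAFEBABE, followed by the minor and the major version as two-byte big-endian values. A small sketch (ClassVersionSketch and majorVersion are names made up for this example):

```java
import java.nio.ByteBuffer;

// Hypothetical helper that reads the major version from a class file header.
class ClassVersionSketch {

    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a class file");
        }
        buf.getShort();                 // minor version, not needed here
        return buf.getShort() & 0xFFFF; // major version as an unsigned value
    }

    public static void main(String[] args) {
        // Header of a class compiled with target 1.5: magic, minor 0, major 0x31
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x31
        };
        System.out.println(majorVersion(header)); // prints 49 (0x31)
    }
}
```

    Passing the bytes of java/bin/my/logging/Log.class to majorVersion should therefore give 49 after the logimpl target has run with target="1.5".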


    Alternative (bytecode manipulation)

    You're lucky. Logging is (almost always) a one-way operation. It barely causes side effects affecting the main program. That means that removing commons-logging should be possible as it won't affect the functionality of the program.

    I chose the second way, bytecode manipulation, which you suggested in your question. The concept is basically just this (A is httpclient, B is commons-logging):

    1. If the return type of a method of A is part of B, the return type will be changed to java.lang.Object.
    2. If any argument of a method of A has a type that is part of B, the argument type will be changed to java.lang.Object.
    3. Invocations of methods belonging to B are removed entirely. pop and constant instructions are inserted to repair the VM stack.
    4. Types belonging to B are removed from descriptors of methods called from A. This requires the target class (the class containing the called method) to be processed. All object types belonging to B will be replaced with java.lang.Object.
    5. Instructions that attempt to access fields of classes belonging to B are removed. pop and constant instructions are inserted to repair the VM stack.
    6. If a method tries to access a field of a type that belongs to B, the field signature referenced by the instruction is changed to java.lang.Object. This requires the target class (the class containing the accessed field) to be processed.
    7. Fields of a type contained in B but belonging to classes of A are modified so that their type is java.lang.Object.

    As you can see, the idea behind this is to replace all referenced classes with java.lang.Object and to remove all accesses to class members belonging to commons-logging.
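
    The type replacement in steps 1, 2, 4, 6 and 7 boils down to rewriting descriptors. As an illustration only, here is the same idea on plain descriptor strings, without ASM. DescriptorSketch and erase are hypothetical names, and the regular expression is a simplification of what the program further down does through the ASM Type API.

```java
import java.util.regex.Pattern;

// Illustration of the "replace with java.lang.Object" idea on raw descriptors.
class DescriptorSketch {

    // Replaces every object type whose internal name starts with the given
    // prefix by java.lang.Object. Simplification: assumes the text "L" + prefix
    // does not occur in the middle of some other class name.
    static String erase(String descriptor, String internalPrefix) {
        String regex = "L" + Pattern.quote(internalPrefix) + "[^;]*;";
        return descriptor.replaceAll(regex, "Ljava/lang/Object;");
    }

    public static void main(String[] args) {
        String prefix = "org/apache/commons/logging/";
        // Method taking a Log and an int: prints (Ljava/lang/Object;I)V
        System.out.println(erase("(Lorg/apache/commons/logging/Log;I)V", prefix));
        // Field of type Log: prints Ljava/lang/Object;
        System.out.println(erase("Lorg/apache/commons/logging/Log;", prefix));
    }
}
```

    Descriptors that do not mention the removed library are returned unchanged, which is why classes without any logging reference pass through the manipulator untouched.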

    I don't know whether this is reliable and I did not test the library after applying the manipulator. But from what I saw (the disassembled class files and no VM errors while loading the class files) I am fairly sure the code works.

    I tried to document almost everything the program does. It uses the ASM Tree API which provides rather simple access to the class file structure. And - to avoid unnecessary negative reviews - this is "quick 'n' dirty" code. I did not really test it a lot and I bet there are faster ways of bytecode manipulation. But this program seems to fulfill the OP's needs and that's all I wrote it for.

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.Enumeration;
    import java.util.List;
    import java.util.jar.JarEntry;
    import java.util.jar.JarFile;
    import java.util.jar.JarOutputStream;
    
    import org.objectweb.asm.ClassReader;
    import org.objectweb.asm.ClassWriter;
    import org.objectweb.asm.Opcodes;
    import org.objectweb.asm.Type;
    import org.objectweb.asm.tree.AbstractInsnNode;
    import org.objectweb.asm.tree.ClassNode;
    import org.objectweb.asm.tree.FieldInsnNode;
    import org.objectweb.asm.tree.FieldNode;
    import org.objectweb.asm.tree.InsnList;
    import org.objectweb.asm.tree.InsnNode;
    import org.objectweb.asm.tree.MethodInsnNode;
    import org.objectweb.asm.tree.MethodNode;
    
    
    public class DependencyFinder {
    
       public static void main(String[] args) throws IOException {
          if(args.length < 2) return;
    
          DependencyFinder df = new DependencyFinder();
          df.analyze(new File(args[0]), new File(args[1]), "org.apache.http/.*", "org.apache.commons.logging..*");
       }
    
       @SuppressWarnings("unchecked")
       public void analyze(File inputFile, File outputFile, String sClassRegex, String dpClassRegex) throws IOException {
          JarFile inJar = new JarFile(inputFile);
          JarOutputStream outJar = new JarOutputStream(new FileOutputStream(outputFile));
    
          for(Enumeration<JarEntry> entries = inJar.entries(); entries.hasMoreElements();) {
             JarEntry inEntry = entries.nextElement();
             InputStream inStream = inJar.getInputStream(inEntry);
    
             JarEntry outEntry = new JarEntry(inEntry.getName());
             outEntry.setTime(inEntry.getTime());
             outJar.putNextEntry(outEntry);
             OutputStream outStream = outJar;
    
             // Only process class files, copy all other resources
             if(inEntry.getName().endsWith(".class")) {
                // Initialize class reader and writer
                ClassReader classReader = new ClassReader(inStream);
                ClassWriter classWriter = new ClassWriter(0);
                String className = classReader.getClassName();
    
                // Check whether to process this class
                if(className.matches(sClassRegex)) {
                   System.out.println("Processing " + className);
                   // Parse entire class
                   ClassNode classNode = new ClassNode(Opcodes.ASM4);
                   classReader.accept(classNode, 0);
    
                   // Check super class and interfaces
                   String superClassName = classNode.superName;
                   if(superClassName.matches(dpClassRegex)) {
                      throw new RuntimeException(className + " extends " + superClassName);
                   }
                   for(String iface : (List<String>) classNode.interfaces) {
                      if(iface.matches(dpClassRegex)) {
                         throw new RuntimeException(className + " implements " + iface);
                      }
                   }
    
                   // Process methods
                   for(MethodNode method : (List<MethodNode>) classNode.methods) {
                      Type methodDesc = Type.getMethodType(method.desc);
                      boolean changed = false;
                      // Change return type if necessary
                      Type retType = methodDesc.getReturnType();
                      if(retType.getClassName().matches(dpClassRegex)) {
                         retType = Type.getObjectType("java/lang/Object");
                         changed = true;
                      }
                      // Change argument types if necessary
                      Type[] argTypes = methodDesc.getArgumentTypes();
                      for(int i = 0; i < argTypes.length; i++) {
                         if(argTypes[i].getClassName().matches(dpClassRegex)) {
                            argTypes[i] = Type.getObjectType("java/lang/Object");
                            changed = true;
                         }
                      }
                      if(changed) {
                         // Update method descriptor
                         System.out.print("Changing " + method.name + methodDesc);
                         methodDesc = Type.getMethodType(retType, argTypes);
                         method.desc = methodDesc.getDescriptor();
                         System.out.println(" to " + methodDesc);
                      }
                      // Remove method invocations
                      InsnList insns = method.instructions;
                      for(int i = 0; i < insns.size(); i++) {
                         AbstractInsnNode insn = insns.get(i);
                         // Ignore all other nodes
                         if(insn instanceof MethodInsnNode) {
                            MethodInsnNode mnode = (MethodInsnNode) insn;
                            Type[] cArgTypes = Type.getArgumentTypes(mnode.desc);
                            Type cRetType = Type.getReturnType(mnode.desc);
    
                            if(mnode.owner.matches(dpClassRegex)) {
                               // The method belongs to one of the classes we want to get rid of
                               System.out.println("Removing method call " + mnode.owner + "." +
                                     mnode.name + " in " + method.name);
                               boolean isStatic = (mnode.getOpcode() == Opcodes.INVOKESTATIC);
                               if(!isStatic) {
                                  // pop instance
                                  insns.insertBefore(insn, new InsnNode(Opcodes.POP));
                               }
                               for(int j = 0; j < cArgTypes.length; j++) {
                                  // pop argument on stack
                                  insns.insertBefore(insn, new InsnNode(Opcodes.POP));
                               }
                               // Insert a constant value to repair the stack
                               if(cRetType.getSort() != Type.VOID) {
                                  InsnNode valueInsn = getValueInstruction(cRetType);
                                  insns.insertBefore(insn, valueInsn);
                               }
                               // Remove the actual method call
                               insns.remove(insn);
                               // Go back one instruction to not skip the next one
                               i--;
                            } else {
                               changed = false;
                               if(cRetType.getClassName().matches(dpClassRegex)) {
                                  // Change return type
                                  cRetType = Type.getObjectType("java/lang/Object");
                                  changed = true;
                               }
                               for(int j = 0; j < cArgTypes.length; j++) {
                                  if(cArgTypes[j].getClassName().matches(dpClassRegex)) {
                                     // Change argument type
                                     cArgTypes[j] = Type.getObjectType("java/lang/Object");
                                     changed = true;
                                  }
                               }
                               if(changed) {
                                  // Update method invocation
                                  System.out.println("Patching method call " + mnode.owner + "." +
                                        mnode.name + " in " + method.name);
                                  mnode.desc = Type.getMethodDescriptor(cRetType, cArgTypes);
                               }
                            }
                         } else if(insn instanceof FieldInsnNode) {
                            // Yeah I lied... we must not ignore all other instructions
                            FieldInsnNode fnode = (FieldInsnNode) insn;
                            Type fieldType = Type.getType(fnode.desc);
                            if(fnode.owner.matches(dpClassRegex)) {
                               System.out.println("Removing field access to " + fnode.owner + "." +
                                     fnode.name + " in " + method.name);
                               // Patch code
                               switch(fnode.getOpcode()) {
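                               case Opcodes.PUTFIELD:
                                  // Pop the value (one or two stack slots), then the instance
                                  insns.insertBefore(insn, new InsnNode(
                                        fieldType.getSize() == 2 ? Opcodes.POP2 : Opcodes.POP));
                                  insns.insertBefore(insn, new InsnNode(Opcodes.POP));
                                  break;
                               case Opcodes.GETFIELD:
                                  // Pop instance
                                  insns.insertBefore(insn, new InsnNode(Opcodes.POP));
                                  // Fall through to push a replacement value
                               case Opcodes.GETSTATIC:
                                  // Repair stack
                                  insns.insertBefore(insn, getValueInstruction(fieldType));
                                  break;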
                               default:
                                  throw new RuntimeException("Invalid opcode");
                               }
                               // Remove instruction
                               insns.remove(fnode);
                               i--;
                            } else {
                               if(fieldType.getClassName().matches(dpClassRegex)) {
                                  // Change field type
                                  System.out.println("Patching field access to " + fnode.owner +
                                        "." + fnode.name + " in " + method.name);
                                  fieldType = Type.getObjectType("java/lang/Object");
                               }
                               // Update field type
                               fnode.desc = fieldType.getDescriptor();
                            }
                         }
                      }
                   }
                   // Process fields
                   for(FieldNode field : (List<FieldNode>) classNode.fields) {
                      Type fieldType = Type.getType(field.desc);
                      if(fieldType.getClassName().matches(dpClassRegex)) {
                         System.out.print("Changing " + fieldType.getClassName() + " " + field.name);
                         fieldType = Type.getObjectType("java/lang/Object");
                         field.desc = fieldType.getDescriptor();
                         System.out.println(" to " + fieldType.getClassName());
                      }
                   }
                   // Class processed
                   classNode.accept(classWriter);
                } else {
                   // Nothing changed
                   classReader.accept(classWriter, 0);
                }
                // Write class to JAR entry
                byte[] bClass = classWriter.toByteArray();
                outStream.write(bClass);
             } else {
                // Copy file
                byte[] buffer = new byte[1024 * 64];
                int read;
                while((read = inStream.read(buffer)) != -1) {
                   outStream.write(buffer, 0, read);
                }
             }
    
             outJar.closeEntry();
          }
          outJar.flush();
          outJar.close();
          inJar.close();
       }
    
       InsnNode getValueInstruction(Type type) {
          switch(type.getSort()) {
          case Type.INT:
          case Type.BOOLEAN:
             return new InsnNode(Opcodes.ICONST_0);
          case Type.LONG:
             return new InsnNode(Opcodes.LCONST_0);
          case Type.OBJECT:
          case Type.ARRAY:
             return new InsnNode(Opcodes.ACONST_NULL);
          default:
             // I am lazy, I did not implement all types
             throw new RuntimeException("Type not implemented: " + type);
          }
       }
    
    }