Tags: java, ruby, memory-leaks, jruby, classloader

Loading JRuby at runtime and ClassLoader leak


I'm trying to load JRuby dynamically at runtime (so I can execute Ruby code using arbitrary JRuby installations and versions). My plan is roughly to create a ClassLoader that has access to jruby.jar, then use it to load the necessary JRuby runtime classes. All was well until I needed to do this multiple times. Even though I tear down each JRuby runtime before creating the next one, the third or fourth one fails with java.lang.OutOfMemoryError: PermGen space.

I've reduced this to a minimal example. The example covers both the "direct" API and the JRuby Embed API. The "direct" API section is commented out, but both exhibit the same behavior: after a few iterations, PermGen runs out of memory. (Tested with JRuby 1.6.7 and JRuby 1.6.5.1.)

import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

import org.junit.Test;

public class JRubyInstantiationTeardownTest {

    @Test
    public void test() throws Exception {
        for (int i = 0; i < 100; ++i) {
            URL[] urls = new URL[] {
                    // three slashes: with "file://path/..." the word "path" is parsed as a host name
                    new URL("file:///path/to/jruby-1.6.7.jar")
            };
            ClassLoader cl = new URLClassLoader(urls, this.getClass().getClassLoader());

            // "Direct" API
            /*
            Class<?> klass = cl.loadClass("org.jruby.Ruby");
            Method newInstance = klass.getMethod("newInstance");
            Method evalScriptlet = klass.getMethod("evalScriptlet", String.class);
            Method tearDown = klass.getMethod("tearDown");

            Object runtime = newInstance.invoke(null);
            System.out.println("have " + runtime);
            evalScriptlet.invoke(runtime, "puts 'hello, world'");
            tearDown.invoke(runtime);
            */

            // JRuby Embed API
            Class<?> scriptingContainerClass = cl.loadClass("org.jruby.embed.ScriptingContainer");
            Method terminate = scriptingContainerClass.getMethod("terminate");
            Method runScriptlet = scriptingContainerClass.getMethod("runScriptlet", String.class);

            Object container = scriptingContainerClass.newInstance();
            System.out.println("have " + container);
            runScriptlet.invoke(container, "puts 'hello, world'");
            terminate.invoke(container);
        }
    }

}

Questions: is this a reasonable thing to try to do with a ClassLoader? If so, is this a bug in JRuby, or am I doing something wrong with my class loading?

Bonus: if this were a bug in JRuby, how might a tool like the Eclipse Memory Analyzer (MAT) help find the source? I can open a heap dump and see several Ruby objects (where I'd expect no more than one at any given time), but I'm not sure how to find out why they aren't being garbage collected...


Solution

  • Edit: reported this as a bug: JRUBY-6522, now fixed.

    After digging around in the Eclipse Memory Analyzer, I ran "Path to GC Roots" on one of the URLClassLoader instances. It was referenced by org.jruby.RubyEncoding$2, which in turn was referenced by java.lang.ThreadLocal$ThreadLocalMap$Entry.

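    Looking inside that source file, I see a static ThreadLocal variable being created: RubyEncoding.java:266. Values stored in a ThreadLocal stay in the owning thread's ThreadLocalMap for as long as that thread is alive, so the long-lived test thread is presumably holding on to entries from every iteration, which keeps my ClassLoader reachable and leaks memory.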
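
The suspected mechanism can be reproduced without JRuby. The following is only a sketch under assumptions: `ThreadLocalRetentionDemo`, `CACHE`, `STORE` and `HAS_VALUE` are hypothetical names, and `CACHE` merely stands in for a library's static ThreadLocal. It shows that a value stored by one task is still attached to the thread when a later task runs on it, while a different thread starts with nothing.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalRetentionDemo {

    // Hypothetical stand-in for a library's static ThreadLocal cache.
    static final ThreadLocal<Object> CACHE = new ThreadLocal<Object>();

    // Stores a value in the ThreadLocalMap of whichever thread runs the task.
    static final Callable<Void> STORE = () -> {
        CACHE.set(new Object());
        return null;
    };

    // Reports whether the thread running the task already holds a value.
    static final Callable<Boolean> HAS_VALUE = () -> CACHE.get() != null;

    // Both tasks run on the same worker thread, so the second sees the first one's value.
    static boolean visibleOnReusedThread() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(STORE).get();
            return pool.submit(HAS_VALUE).get();
        } finally {
            pool.shutdown();
        }
    }

    // The tasks run on two different worker threads, so the second sees nothing.
    static boolean visibleOnFreshThread() throws Exception {
        ExecutorService first = Executors.newSingleThreadExecutor();
        ExecutorService second = Executors.newSingleThreadExecutor();
        try {
            first.submit(STORE).get();
            return second.submit(HAS_VALUE).get();
        } finally {
            first.shutdown();
            second.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("reused thread still holds value: " + visibleOnReusedThread()); // true
        System.out.println("fresh thread holds value: " + visibleOnFreshThread());         // false
    }
}
```

In the failing test every iteration runs on the same JUnit thread, which corresponds to the "reused thread" case here.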

    This code example succeeds:

    import java.lang.reflect.Method;
    import java.net.URL;
    import java.net.URLClassLoader;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    
    import org.junit.Test;
    
    public class JRubyInstantiationTeardownTest {
    
        public static int i;
    
        @Test
        public void test() throws Exception {
    
            for (i = 0; i < 100; ++i) {
    
                URL[] urls = new URL[] {
                    new URL("file:///home/pat/jruby-1.6.7/lib/jruby.jar")
                };
    
                final ClassLoader cl = new URLClassLoader(urls, this.getClass().getClassLoader());
    
                final Class<?> rubyClass = cl.loadClass("org.jruby.Ruby");
                final Method newInstance = rubyClass.getMethod("newInstance");
                final Method evalScriptlet = rubyClass.getMethod("evalScriptlet", String.class);
                final Method tearDown = rubyClass.getMethod("tearDown");
    
                // "Direct" API
                Callable<Void> direct = new Callable<Void>() {
                    public Void call() throws Exception {
                        // created inside thread because initialization happens immediately
                        final Object ruby = newInstance.invoke(null);
    
                        System.out.println("" + i + ": " + ruby);
                        evalScriptlet.invoke(ruby, "puts 'hello, world'");
                        tearDown.invoke(ruby);
                        return null;
                    }
                };
    
                // JRuby Embed API
                final Class<?> scriptingContainerClass = cl.loadClass("org.jruby.embed.ScriptingContainer");
                final Method terminate = scriptingContainerClass.getMethod("terminate");
                final Method runScriptlet = scriptingContainerClass.getMethod("runScriptlet", String.class);
    
                // created outside thread because ruby instance not created immediately
                final Object container = scriptingContainerClass.newInstance();
    
                Callable<Void> embed = new Callable<Void>() {
                    public Void call() throws Exception {
    
                        System.out.println(i + ": " + container);
                        runScriptlet.invoke(container, "puts 'hello, world'");
                        terminate.invoke(container);
                        return null;
                    }
                };
    
                // separate thread for each loop iteration so its ThreadLocal vars are discarded
                final ExecutorService executor = Executors.newSingleThreadExecutor();
                executor.submit(direct).get();
                executor.submit(embed).get();
                executor.shutdown();
            }
        }
    
    }
    

    Now I'm wondering whether this is acceptable behavior for JRuby, and what JRuby-Rack does inside a servlet container that manages its own thread pool to process requests. It seems like one would need to maintain a completely separate thread pool, execute Ruby code only on those threads, and then ensure they get destroyed when the servlet is undeployed...
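
The "separate thread pool" idea might look roughly like the sketch below. This is an assumption about how one could structure it, not what JRuby-Rack or any servlet container actually does; the class `DedicatedScriptThreads` and its methods are hypothetical names. Scripting work is funneled through a private pool, and shutting the pool down lets its worker threads exit, at which point their thread-local values are no longer held by a live thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DedicatedScriptThreads {

    // Private pool: only tasks passed to call() ever run on these threads.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Runs the task on a pool thread and blocks the caller until it finishes.
    public <T> T call(Callable<T> task) throws Exception {
        return pool.submit(task).get();
    }

    // Intended for undeploy time: stops accepting work and waits for the
    // worker threads to exit. Returns true if the pool terminated in time.
    public boolean shutdown(long timeoutSeconds) throws InterruptedException {
        pool.shutdown();
        return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        DedicatedScriptThreads threads = new DedicatedScriptThreads();
        String worker = threads.call(() -> Thread.currentThread().getName());
        System.out.println("task ran on " + worker
                + ", caller is " + Thread.currentThread().getName());
        System.out.println("pool terminated: " + threads.shutdown(5));
    }
}
```

In a servlet the `shutdown` call would presumably go in a lifecycle hook such as `destroy()`, with each request handler delegating its Ruby work through `call`.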

    This is very relevant: Tomcat Memory Leak Protection

    See also JVM bug report: Provide reclaimable thread local values without Thread termination