Search code examples
liferayosgiliferay-7jstackgogo-shell

Liferay module (OSGi bundle) stays in "Stopping"


Sometimes when I stop my Liferay module (for instance when I put a new version of its JAR in deploy/) the module refuses to stop.

While the module should go to "Resolved" state, it stays in the "Stopping" state forever:

OSGi lifecycle

Usually it is due to a thread not terminated somewhere, or a network connection not properly closed, and it is often a pain to investigate.

My question: How to find out more efficiently what this Liferay module's problem is?

What I tried:

  • In Gogo Shell diag <module id> does not seem to provide any valuable information about why the module refuses to leave the "Stopping" state.
  • jstack outputs thousands of lines, the vast majority of which is outside of the Liferay module in question. If there was a way to show jstack information for only my module that would be wonderful.

Solution

  • First, find the PID of your webapp server:

    ps aux | grep tomcat
    

    Adapt the command if you are running another server than tomcat, or if you have several instances running.

    Then, dump all threads of that server to a file:

    jstack 12345 > jstack.txt
    

    Where 12345 is the PID you found in the first step.

    Then, look at the source code of your bundle, and find the service activator. It typically looks like this:

    package fr.free.nrw;
    
    [import section]
    
    public class ServiceActivator implements BundleActivator {
    
        private ServiceRegistration registration;
    
        @Override
        public void start(BundleContext context) throws Exception {
            registration = context.registerService(
                MyService.class.getName(), new MyServiceImpl(), null);
        }
    
        @Override
        public void stop(BundleContext context) throws Exception {
            registration.unregister();
        }
    }
    

    Take note of:

    • the namespace,
    • the class name,
    • the stopping method name.

    For instance in the example above they are fr.free.nrw, ServiceActivator and stop, and from these three get the full name fr.free.nrw.ServiceActivator.stop.

    Now open jstack.txt and search for the full name. Even though the file is thousands of lines long, there will most probably only be one hit, and that is the problematic thread:

    at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:222)
    at fr.free.nrw.ServiceActivator.stop(ServiceActivator.java:30)
    at org.eclipse.osgi.internal.framework.BundleContextImpl$4.run(BundleContextImpl.java:830)
    at org.eclipse.osgi.internal.framework.BundleContextImpl$4.run(BundleContextImpl.java:1)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.eclipse.osgi.internal.framework.BundleContextImpl.stop(BundleContextImpl.java:823)
    

    In this file, go up to the beginning of the paragraph, which will be something like this:

    "fileinstall-/home/nico/p/liferay/osgi/modules" #37 daemon prio=5 os_prio=0 tid=0x00007f39480e3000 nid=0x384f waiting on condition [0x00007f395d169000]
      java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000eb8defb8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    

    With this information in hand, you will known what kind of threading problem is going on, and you will be able to solve it with usual Java thread debugging techniques (1 2).