
OpenDaylight bundles stuck in GracePeriod and cluster not coming up


We are using the ODL Nitrogen release. When we perform a warm start (i.e., restart the Karaf servers without deleting the "KARAF_HOME/data" folder), the following bundles stay in the "GracePeriod" state for a long time, so other application bundles that depend on them fail. However, when we start Karaf in a clean state (without the data folder), all bundles come up fine.

We also noticed that netty.tcp port 2550 is not bound when the bundles go into the failure state. We confirmed that this port is not being used by any other process.
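One way to double-check from the JVM side whether the port can be bound at a given moment is a small probe like the one below. It is only a sketch: the class name `PortProbe` and the method `canBind` are made up for illustration, and it uses the plain `java.net.ServerSocket` API rather than anything from Akka or ODL. It must be run while Karaf is stopped, because a successful probe briefly occupies the port itself.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of ODL or Akka.
public class PortProbe {

    /** Returns true if a listening socket can be bound to the port right now. */
    static boolean canBind(int port) {
        // The socket is closed again as soon as the try block exits.
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(port));
            return true;
        } catch (IOException e) {
            // Typically a BindException ("Address already in use").
            return false;
        }
    }

    public static void main(String[] args) {
        // 2550 is the netty.tcp port mentioned in the question.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 2550;
        System.out.println("port " + port + " bindable: " + canBind(port));
    }
}
```

If the probe reports that the port cannot be bound even though no process is listed as listening on it, connections from the previous run may still be lingering in the operating system.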

349 | GracePeriod |  80 | 2.3.3                               | mdsal-eos-binding-adapter
350 | Active      |  80 | 2.3.3                               | mdsal-eos-binding-api
351 | Active      |  80 | 2.3.3                               | mdsal-eos-common-api
352 | Active      |  80 | 2.3.3                               | mdsal-eos-common-spi
376 | GracePeriod |  80 | 2.3.3                               | mdsal-singleton-dom-impl
142 | Active      |  80 | 2.4.20                              | akka-actor
143 | Active      |  80 | 2.4.20                              | akka-cluster
144 | Active      |  80 | 2.4.20                              | akka-osgi
145 | Active      |  80 | 2.4.20                              | akka-persistence
146 | Active      |  80 | 2.4.20                              | akka-protobuf
147 | Active      |  80 | 2.4.20                              | akka-remote
148 | Active      |  80 | 2.4.20                              | akka-slf4j
149 | Active      |  80 | 2.4.20                              | akka-stream
310 | Active      |  80 | 1.6.3                               | org.opendaylight.controller.sal-akka-raft

We also observe the following logs rolling continuously; this message in particular appears very frequently. It seems to keep the other bundles from making progress.

2018-07-02 22:52:47,299 | WARN  | saction-25-27'}}  | 298 - org.opendaylight.controller.config-manager - 0.7.3 | DeadlockMonitor$DeadlockMonitorRunnable | ModuleIdentifier{factoryName='binding-broker-impl', instanceName='binding-broker-impl'} did not finish after 84984 ms

2018-07-02 22:52:50,717 | ERROR | rint Extender: 3  | 325 - org.opendaylight.controller.sal-distributed-datastore - 1.6.3 | AbstractDataStore | Shard leaders failed to settle in 90 seconds, giving up

Diag output of the GracePeriod bundles:

karaf@virtuora>diag 349
mdsal-eos-binding-adapter (349)
-------------------------------
Status: GracePeriod
Blueprint
7/3/18 6:17 PM
Missing dependencies:
(objectClass=org.opendaylight.mdsal.binding.dom.codec.api.BindingNormalizedNodeSerializer) (objectClass=org.opendaylight.mdsal.eos.dom.api.DOMEntityOwnershipService)

karaf@virtuora>diag 376
mdsal-singleton-dom-impl (376)
------------------------------
Status: GracePeriod
Blueprint
7/3/18 6:22 PM
Missing dependencies:
(objectClass=org.opendaylight.mdsal.eos.dom.api.DOMEntityOwnershipService)

Please let us know:

  1. why Akka is unable to open the netty tcp port
  2. why DOMEntityOwnershipService and BindingNormalizedNodeSerializer are reported as missing dependencies

Solution

  • You need to set SO_REUSEADDR so that the port can be reused immediately after it is closed. See https://docs.oracle.com/javase/7/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR If you do not set this option, the port stays blocked for a while; how long depends on the operating system.

    If possible, you should also avoid forcefully killing the process, as this does not shut down the ports cleanly.
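As a minimal sketch of what setting this option looks like in plain Java NIO (the API the linked page documents), the snippet below enables SO_REUSEADDR before binding. The class name `ReuseAddrSketch` and the method `bindWithReuse` are hypothetical. In ODL the socket on port 2550 is opened by Akka remoting rather than by application code, so this only illustrates the socket option itself, not where to configure it in Akka.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;

// Hypothetical illustration, not ODL or Akka code.
public class ReuseAddrSketch {

    /** Opens a listening channel with SO_REUSEADDR enabled before the bind. */
    static ServerSocketChannel bindWithReuse(int port) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        try {
            // The option must be set before bind() to have any effect.
            channel.setOption(StandardSocketOptions.SO_REUSEADDR, true);
            channel.bind(new InetSocketAddress(port));
            return channel;
        } catch (IOException e) {
            channel.close(); // do not leak the channel if the bind fails
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // 2550 is the port from the question; pass another port as an argument.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 2550;
        try (ServerSocketChannel channel = bindWithReuse(port)) {
            System.out.println("listening on " + channel.getLocalAddress());
        } // closing the channel releases the port cleanly
    }
}
```

The try-with-resources block in `main` also reflects the second point of the answer: closing the channel in an orderly way releases the port, which a forced kill does not do.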