
Wildfly 9 - mod_cluster on TCP


We are currently testing a move from Wildfly 8.2.0 to Wildfly 9.0.0.CR1 (or CR2 built from snapshot). The system is a cluster using mod_cluster and it runs on VPSes, which prevents it from using multicast.

On 8.2.0 we have been using the following modcluster configuration, which works well:

      <mod-cluster-config proxy-list="1.2.3.4:10001,1.2.3.5:10001" advertise="false" connector="ajp">
          <dynamic-load-provider>
              <load-metric type="cpu"/>
          </dynamic-load-provider>
      </mod-cluster-config>

Unfortunately, on 9.0.0 proxy-list was deprecated and server startup fails with an error. Documentation is badly lacking; however, after a couple of tries I discovered that proxy-list was replaced with proxies, which is a list of outbound-socket-binding names. Hence, the configuration looks like the following:

      <mod-cluster-config proxies="mc-prox1 mc-prox2" advertise="false" connector="ajp">
          <dynamic-load-provider>
              <load-metric type="cpu"/>
          </dynamic-load-provider>
      </mod-cluster-config>

And the following should be added into the appropriate socket-binding-group (full-ha in my case):

    <outbound-socket-binding name="mc-prox1">
        <remote-destination host="1.2.3.4" port="10001"/>
    </outbound-socket-binding>
    <outbound-socket-binding name="mc-prox2">
        <remote-destination host="1.2.3.5" port="10001"/>
    </outbound-socket-binding>

So far so good. After this, the httpd cluster starts registering the nodes. However, I am getting errors from the load balancer. When I look into /mod_cluster-manager, I see a couple of Node REMOVED lines, and there are also many errors like:

ERROR [org.jboss.modcluster] (UndertowEventHandlerAdapter - 1) MODCLUSTER000042: Error MEM sending STATUS command to node1/1.2.3.4:10001, configuration will be reset: MEM: Can't read node

The mod_cluster log contains the equivalent warnings:

manager_handler STATUS error: MEM: Can't read node

As far as I understand, the problem is that although wildfly/modcluster is able to connect to httpd/mod_cluster, it does not work the other way. Unfortunately, even after an extensive effort I am stuck.

Could someone help with setting up mod_cluster for Wildfly 9.0.0 without advertising? Thanks a lot.


Solution

  • After a couple of weeks I got back to the problem and found the solution. The problem was, of course, in the configuration and had nothing to do with the particular version of Wildfly. More specifically:

    There were three nodes in the domain and three servers in each node. All nodes were launched with the following property:

    -Djboss.node.name=nodeX
    

    ...where nodeX is the name of a particular node. However, this meant that all three servers on the node got the same name, which is exactly what confused the load balancer. As soon as I removed this property, everything started to work.
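
    If an explicit name is still wanted, one possible alternative to dropping the property entirely is to set it per server instead of per host, so that every server ends up with a distinct value. The following is only a sketch under assumptions, not something tested on this setup: it assumes a domain-mode host.xml, and the server names, the group name and the property values are all hypothetical placeholders.

        <!-- host.xml on the host called node1 (hypothetical names throughout) -->
        <servers>
            <server name="server-one" group="main-server-group">
                <system-properties>
                    <!-- assumption: one unique value per server, never shared -->
                    <property name="jboss.node.name" value="node1-server-one"/>
                </system-properties>
            </server>
            <server name="server-two" group="main-server-group">
                <system-properties>
                    <property name="jboss.node.name" value="node1-server-two"/>
                </system-properties>
            </server>
        </servers>

    Whichever approach is used, the point is the same as above: no two servers registering with the same load balancer should report the same node name.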