WSO2 AM: propagating organization changes without Hazelcast

We are running a multi-tenant WSO2 AM 2.6 cluster that has two kinds of nodes:

Full-profile node (publisher, store, KM, etc.)
Gateway worker nodes

Information is shared between the publisher and the gateways using EFS.

So far we have been running with Hazelcast enabled, but we would prefer to disable it because it is causing us a lot of pain in production, and we understand that in WSO2 2.x it is not mandatory to have it enabled.

We tested our system with the following setting:

<clustering class="org.wso2.carbon.core.clustering.hazelcast.HazelcastClusteringAgent" enable="false">

Everything ran fine except for one side effect: it takes a long time (up to 15 minutes) until the deactivation or re-activation of a tenant is propagated to the worker nodes.

When we create a completely new organization with a newly created API, the API can be invoked on the worker almost instantly. But if we then disable the organization, the API keeps running. It takes a long time until the worker reports that the tenant is no longer active.

The same happens when re-activating a tenant. It takes a long time until the worker stops complaining about an inactive organization and allows the API to run.

Is there a configuration setting we need to change? Or is this expected behavior? What is supposed to notify the workers about organization changes in the absence of Hazelcast?

Solution

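There is a tenant cache [1] that holds tenant information. The default TTL of this cache (and of any other cache) is 15 minutes. When you deactivate a tenant, this distributed cache is cleared through Hazelcast. With Hazelcast clustering disabled, that clearing does not reach the worker nodes, so they keep using the cached tenant state until the entry expires. That is why you see the delay of up to 15 minutes.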
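To make the effect concrete, here is a minimal toy model of a per-node cache with a TTL. It is not WSO2 code: the class `TenantCacheSketch`, its methods, and the in-memory map standing in for the tenant database are all hypothetical names made up for illustration. It only shows why a worker keeps seeing the old tenant state until either the TTL passes or something invalidates the entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Toy model of a per-node tenant cache with a TTL. Hypothetical, not WSO2 code. */
class TenantCacheSketch {
    // Mirrors the 15-minute DefaultCacheTimeout mentioned in the answer.
    static final long TTL_MILLIS = 15 * 60 * 1000L;

    /** A cached "is this tenant active?" answer plus the time it was loaded. */
    private static final class Entry {
        final boolean active;
        final long loadedAt;

        Entry(boolean active, long loadedAt) {
            this.active = active;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<String, Boolean> database; // assumed stand-in for the shared tenant store
    private final LongSupplier clock;            // injectable clock so the TTL can be simulated
    private final Map<String, Entry> entries = new HashMap<>();

    TenantCacheSketch(Map<String, Boolean> database, LongSupplier clock) {
        this.database = database;
        this.clock = clock;
    }

    /** Returns the cached state; reloads from the store only on a miss or after the TTL. */
    boolean isActive(String tenantDomain) {
        long now = clock.getAsLong();
        Entry entry = entries.get(tenantDomain);
        if (entry == null || now - entry.loadedAt >= TTL_MILLIS) {
            entry = new Entry(database.getOrDefault(tenantDomain, false), now);
            entries.put(tenantDomain, entry);
        }
        return entry.active;
    }

    /** What a cluster-wide invalidation message would trigger on each node. */
    void invalidate(String tenantDomain) {
        entries.remove(tenantDomain);
    }

    public static void main(String[] args) {
        Map<String, Boolean> db = new HashMap<>();
        long[] now = {0L};
        db.put("example.com", true);

        TenantCacheSketch worker = new TenantCacheSketch(db, () -> now[0]);
        System.out.println(worker.isActive("example.com")); // loads and caches the state

        db.put("example.com", false);                       // tenant deactivated elsewhere
        now[0] = 5 * 60 * 1000L;
        System.out.println(worker.isActive("example.com")); // still the cached (stale) state

        worker.invalidate("example.com");                   // simulated invalidation call
        System.out.println(worker.isActive("example.com")); // reloaded from the store
    }
}
```

In this model, disabling clustering corresponds to `invalidate` never being called on the worker, so the only way the change becomes visible is the TTL check inside `isActive`.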

Typically, in a production environment, it is very unlikely that you will need to activate and deactivate tenants frequently, so I don't think a 15-minute delay is a serious problem.

However, if it really is, you have to keep Hazelcast clustering enabled. When you say Hazelcast caused you a lot of pain, I believe that is because of the distributed nature of these caches. As a workaround, you may enable local caches instead of the distributed cache. In that mode, Hazelcast clustering is used only for the cache invalidation calls. That might work for you. (Disclaimer: I haven't tried this yet.)

For this, you need to set ForceLocalCache to true in carbon.xml:

<Cache>
    <!-- Default cache timeout in minutes -->
    <DefaultCacheTimeout>15</DefaultCacheTimeout>
    <!-- Force all caches to act as local -->
    <ForceLocalCache>true</ForceLocalCache>
</Cache>

[1] https://github.com/wso2/carbon-kernel/blob/4.4.x/core/org.wso2.carbon.user.core/src/main/java/org/wso2/carbon/user/core/tenant/JDBCTenantManager.java#L303