
IBM Worklight 6.2. Analytics topology. Master and data Nodes


I'm reading about production topology for the Analytics part of Worklight 6.2.

https://www-01.ibm.com/support/knowledgecenter/api/content/SSZH4A_6.2.0/com.ibm.worklight.monitor.doc/monitor/t_setting_up_production_cluster.html

It explains that a node can act as both a master node and a data node, or as only one of the two.

My question is why we should configure dedicated nodes (master or data) instead of configuring every node as both master and data.

I assume that the one node acting as master will perform worse in its data role, but on the other hand the configuration will be simpler and availability will be higher.

Thank you.


Solution

  • Your assumption is correct.

    A master node is responsible for handling communication between the data nodes, while the data nodes are responsible for indexing data. Dedicated master and data nodes can each focus their processing time and memory on their specific task. However, as you mentioned, in some cases the benefit is not worth the added configuration complexity.

    Another reason is that it's not necessary to put a master node on a high-performing machine, so you can reserve the better machines for the data nodes.
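
    To make the three role combinations concrete, here is a minimal sketch that prints the two role flags for each kind of node. It assumes the plain Elasticsearch 1.x setting names `node.master` and `node.data`; how Worklight Analytics exposes these settings is covered in the IBM documentation linked in the question, and the role labels (`combined`, `dedicated-master`, `dedicated-data`) are hypothetical names chosen for this example.

```python
# Illustrative sketch only. "node.master" and "node.data" are plain
# Elasticsearch 1.x settings; the way Worklight Analytics sets them
# may differ (see the IBM documentation linked in the question).


def node_settings(master: bool, data: bool) -> dict:
    """Return the two role flags for a single Elasticsearch node."""
    return {"node.master": master, "node.data": data}


def to_yaml(settings: dict) -> str:
    """Render the flags as elasticsearch.yml-style lines."""
    return "\n".join(
        f"{key}: {str(value).lower()}" for key, value in settings.items()
    )


# Hypothetical role labels, used only for this example.
ROLES = {
    "combined": node_settings(master=True, data=True),
    "dedicated-master": node_settings(master=True, data=False),
    "dedicated-data": node_settings(master=False, data=True),
}

if __name__ == "__main__":
    for name, settings in ROLES.items():
        print(f"# {name}")
        print(to_yaml(settings))
```

    A dedicated master is eligible to be elected but holds no index data, so it can run on a smaller machine, while a dedicated data node is never elected and can spend its resources on indexing.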

    The analytics console uses Elasticsearch under the covers. It would be worth looking up the benefits and drawbacks of choosing master and data nodes in Elasticsearch since it is an open source library and there are several resources available for it.

    Edit:

    As you can imagine, there is no one size fits all configuration. The configuration depends on several factors such as:

    • How long you wish to keep data stored
    • How many machines you have to dedicate to analytics
    • How verbose your client logging is configured to be
    • Your preferences between availability and performance
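
    The availability side of that last trade-off can be estimated with a little arithmetic. The sketch below assumes the usual Elasticsearch guidance that a master can only be elected by a majority of master-eligible nodes (the value normally given to `discovery.zen.minimum_master_nodes`). The topology labels and node counts are hypothetical, and whether Worklight Analytics lets you set this value directly is not something this answer confirms.

```python
# Illustrative sketch only: majority (quorum) arithmetic for
# master-eligible nodes, following common Elasticsearch guidance.


def master_quorum(master_eligible: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible: int) -> int:
    """Master-eligible nodes that can fail while a majority remains."""
    return master_eligible - master_quorum(master_eligible)


# Hypothetical topologies: label -> number of master-eligible nodes.
TOPOLOGIES = {
    "3 nodes, all master and data": 3,
    "1 dedicated master, 2 data-only": 1,
    "3 dedicated masters, data-only nodes separate": 3,
}

if __name__ == "__main__":
    for label, eligible in TOPOLOGIES.items():
        print(
            f"{label}: quorum={master_quorum(eligible)}, "
            f"tolerated failures={tolerated_master_failures(eligible)}"
        )
```

    Under these assumptions a single dedicated master is a single point of failure, whereas three master-eligible nodes (combined or dedicated) can lose one and still elect a master.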

    In my personal tests, I typically configure each node as both a data node and a master node. It's possible that in the future we will document how the different configurations affect performance.