
Elasticsearch Logging Real-Time Example


I've been working with ELK (Elasticsearch, Logstash, Kibana) recently, and I must admit that some concepts haven't quite clicked with me when I try to apply them in real-world scenarios. Let me explain my use case and the questions that have been bothering me with an example.

Let's say we want to perform log analysis using ELK, collecting logs from multiple servers. For instance, we are collecting logs from servers with IP addresses IP1, IP2, IP3, and IP4 using Logstash (accumulating an average of 10 GB of log data daily on each server). I want to index these logs in Elasticsearch. I've decided to group logs from IP1 and IP2 servers into INDEX1 and logs from IP3 and IP4 servers into INDEX2 because these IP ranges correspond to different applications' logs. Let's assume that Elasticsearch and Kibana are installed on a server with the IP address IP5.

In such a setup, I'm struggling to fully grasp some abstract aspects when it comes to configuring nodes in our Elasticsearch cluster. I'm not quite sure how to structure my node setup. Here are two possible configurations I'm considering:

cluster.name: my-cluster
node.name: master-node-1
node.master: true
node.data: false
network.host: IP5
discovery.seed_hosts: ["IP5"]
cluster.initial_master_nodes: ["master-node-1", "master-node-2", "master-node-3"]

cluster.name: my-cluster
node.name: data-node-1
node.master: false
node.data: true
network.host: IP5
discovery.seed_hosts: ["IP5"]

Or is this structure a better fit for my case?

cluster.name: my-cluster
node.name: master-node-1
network.host: IP5
discovery.seed_hosts: ["IP5", "IP1", "IP2", "IP3", "IP4"]
cluster.initial_master_nodes: ["my-node-1"]

I'm wondering if this structure makes sense, or if there's a better way to set up the nodes. If you have any recommendations or if I've misunderstood certain aspects, I would greatly appreciate your insights. Additionally, if you can suggest articles or resources that explain multi-server setups with real-time examples, I would find that very helpful.

Thank you!

I attempted to configure Elasticsearch nodes based on my specific use case and considered the two node configurations shown above.


Solution

  • The servers sending the logs are not part of the Elasticsearch cluster, so scenario 1 is probably the better fit. But at your size, and depending on your retention period, separating master nodes and data nodes is probably not necessary.

    Instead, build a 3-node cluster in which every node has all the roles, and ingest your data there.
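
As a rough illustration of that 3-node layout, here is a minimal sketch of the elasticsearch.yml for the first node. The host names ES1, ES2, ES3 and the node names es-node-1 to es-node-3 are placeholders invented for this example and do not come from the question. The sketch assumes a recent Elasticsearch version where roles are controlled by node.roles (the node.master / node.data settings used in the question are the older form); leaving node.roles unset should keep the default, in which a node takes on all roles.

```yaml
# elasticsearch.yml for the first of three identical nodes.
# ES1/ES2/ES3 and es-node-1..3 are hypothetical placeholders.
cluster.name: my-cluster
node.name: es-node-1

# node.roles is intentionally not set, so this node keeps the default
# roles (master-eligible, data, ingest, ...).

# Address this node binds to.
network.host: ES1

# Only the Elasticsearch nodes are listed here. The log-shipping
# servers (IP1..IP4) are clients and do not belong in this list.
discovery.seed_hosts: ["ES1", "ES2", "ES3"]

# Used only when the cluster is bootstrapped for the first time.
# The entries must match the node.name values of the nodes.
cluster.initial_master_nodes: ["es-node-1", "es-node-2", "es-node-3"]
```

The other two nodes would use the same file with only node.name and network.host changed. With four servers producing about 10 GB each per day, the cluster takes in roughly 40 GB daily before replicas, so the disk size per node mostly depends on how long the logs are kept.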