I want to build a three-node Elasticsearch cluster on version 7.2, but something unexpected is happening.
I have three virtual machines (192.168.7.2, 192.168.7.3, and 192.168.7.4); their main settings in config/elasticsearch.yml are:
# node-2 (192.168.7.2)
cluster.name: ucas
node.name: node-2
network.host: 192.168.7.2
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]
http.cors.enabled: true
http.cors.allow-origin: "*"

# node-3 (192.168.7.3)
cluster.name: ucas
node.name: node-3
network.host: 192.168.7.3
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]

# node-4 (192.168.7.4)
cluster.name: ucas
node.name: node-4
network.host: 192.168.7.4
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]
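After starting all three nodes, one quick way to confirm the cluster has actually formed (these checks are not in the original post, just a suggested sanity check) is the cat/health APIs:

```
GET _cat/nodes?v

GET _cluster/health
```

_cat/nodes should list all three nodes, and the cluster health should report "number_of_nodes" : 3.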
When I start each node, create an index named moive with 3 shards and 0 replicas, and then write some docs to the index, the cluster looks healthy:
PUT moive
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 0
}
}
PUT moive/_doc/3
{
"title":"title 3"
}
Then I set the replica count of moive to 1:
PUT moive/_settings
{
"number_of_replicas": 1
}
Everything goes well, but when I set the replica count of moive to 2:
PUT moive/_settings
{
"number_of_replicas": 2
}
The new replicas cannot be assigned to node-2.
I do not know which step is wrong; please help me figure out what is happening.
First, find the reason why the shard cannot be assigned, using the cluster allocation explain API:

GET _cluster/allocation/explain?pretty

The response shows:
{
  "index" : "moive",
  "shard" : 2,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2019-07-19T06:47:29.704Z",
    "details" : "node_left [tIm8GrisRya8jl_n9lc3MQ]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "kQ0Noq8LSpyEcVDF1POfJw",
      "node_name" : "node-3",
      "transport_address" : "192.168.7.3:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "store" : {
        "matching_sync_id" : true
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[moive][2], node[kQ0Noq8LSpyEcVDF1POfJw], [R], s[STARTED], a[id=Ul73SPyaTSyGah7Yl3k2zA]]"
        }
      ]
    },
    {
      "node_id" : "mNpqD9WPRrKsyntk2GKHMQ",
      "node_name" : "node-4",
      "transport_address" : "192.168.7.4:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "store" : {
        "matching_sync_id" : true
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[moive][2], node[mNpqD9WPRrKsyntk2GKHMQ], [P], s[STARTED], a[id=yQo1HUqoSdecD-SZyYMYfg]]"
        }
      ]
    },
    {
      "node_id" : "tIm8GrisRya8jl_n9lc3MQ",
      "node_name" : "node-2",
      "transport_address" : "192.168.7.2:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "disk_threshold",
          "decision" : "NO",
          "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [2.2790256709451573E-4%]"
        }
      ]
    }
  ]
}
The disk_threshold decider is the culprit: the disk on node-2 is nearly full (8.0G of 8.4G used, about 95%, well above the default 85% low watermark), so Elasticsearch refuses to place new shard copies there:
[vagrant@node2 ~]$ df -h
Filesystem               Size  Used  Avail  Use%  Mounted on
/dev/mapper/centos-root  8.4G  8.0G  480M   95%   /
devtmpfs                 2.4G     0  2.4G    0%   /dev
tmpfs                    2.4G     0  2.4G    0%   /dev/shm
tmpfs                    2.4G  8.4M  2.4G    1%   /run
tmpfs                    2.4G     0  2.4G    0%   /sys/fs/cgroup
/dev/sda1                497M  118M  379M   24%   /boot
none                     234G  149G   86G   64%   /vagrant
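Freeing disk space is the proper fix. As a temporary stopgap (assuming default watermark settings, and remembering to revert afterwards), the watermarks can also be raised with a transient cluster setting:

```
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%"
  }
}
```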
Then I cleaned up the disk space, and everything went back to normal.
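After space is freed, Elasticsearch's disk monitor re-checks the nodes periodically and retries allocation on its own; recovery can be confirmed with, for example:

```
GET _cluster/health/moive

GET _cat/shards/moive?v
```

The index health should return to green, and every copy of each shard should show as STARTED.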