Currently I'm indexing a massive amount of data into an Elasticsearch node. At the beginning it was quite fast, but after roughly one day it became extremely slow. I went to one of the data nodes and ran iotop:
Total DISK READ: 30.62 M/s | Total DISK WRITE: 1258.86 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
3951 be/4 elastics 1580.54 K/s 0.00 B/s 0.00 % 99.99 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3943 be/4 elastics 2.93 M/s 2.39 K/s 0.00 % 99.62 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3950 be/4 elastics 1434.83 K/s 0.00 B/s 0.00 % 99.42 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3941 be/4 elastics 2.48 M/s 46.98 K/s 0.00 % 99.11 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3939 be/4 elastics 3.86 M/s 7.96 K/s 0.00 % 99.01 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3944 be/4 elastics 3.25 M/s 9.55 K/s 0.00 % 98.91 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3942 be/4 elastics 3.41 M/s 82.81 K/s 0.00 % 98.13 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3945 be/4 elastics 3.49 M/s 81.22 K/s 0.00 % 97.77 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3940 be/4 elastics 3.06 M/s 15.92 K/s 0.00 % 97.58 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3938 be/4 elastics 3.13 M/s 121.83 K/s 0.00 % 96.40 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3953 be/4 elastics 1567.80 K/s 44.59 K/s 0.00 % 83.50 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
3952 be/4 elastics 542.24 K/s 667.25 K/s 0.00 % 71.46 % java -Xms16g -Xmx16g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseC~t.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init
So I'm wondering why the disk read is so high while the write is so low. (I'm running 200 processes in parallel, and every 2 or 3 seconds each process performs a bulk index operation of size 800.)
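For context, each worker does roughly the following (a simplified sketch, not my exact code; the index name, type, and document shape are placeholders):

import json
import time
import requests

# Hypothetical sketch of one worker process; "myindex", "mytype",
# and the document contents are placeholders.
ES_URL = "http://localhost:9200/_bulk"
BULK_SIZE = 800

def bulk_index(docs):
    # The _bulk API takes newline-delimited JSON: an action line
    # followed by the document source, one pair per document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": "myindex", "_type": "mytype"}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"
    resp = requests.post(ES_URL, data=body,
                         headers={"Content-Type": "application/x-ndjson"})
    resp.raise_for_status()

while True:
    docs = [{"field": "value"} for _ in range(BULK_SIZE)]  # placeholder docs
    bulk_index(docs)
    time.sleep(2)  # each worker sends one bulk request every 2-3 seconds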
PS: each box is an m3.2xlarge with 32 GB of memory.
Any ideas?
Thanks!
One likely cause of the massive disk reads is the segment merging that Elasticsearch performs in the background. During a merge, the contents of several older segments are read from disk and combined into a single, larger segment.
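You can confirm whether merging is the culprit by checking the merge statistics exposed by the indices stats API. A quick sketch, assuming Elasticsearch is reachable on localhost:9200 (the index names in the output depend on your setup):

import requests

# Query merge statistics for all indices; "merge" is the stats metric,
# and the numbers come back under the "merges" key per index.
resp = requests.get("http://localhost:9200/_stats/merge")
stats = resp.json()

for name, index in stats["indices"].items():
    merges = index["total"]["merges"]
    print("%s: %d merges running, %d total, %.1f s spent merging" % (
        name,
        merges["current"],
        merges["total"],
        merges["total_time_in_millis"] / 1000.0,
    ))

If "current" is consistently non-zero and "total_time_in_millis" is growing quickly, merges are competing with your indexing for disk I/O.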
You can read more about merging in the Elasticsearch guide here: http://www.elastic.co/guide/en/elasticsearch/guide/current/merge-process.html
And for performance considerations around merging I would recommend taking a look at this section of the guide: http://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-performance.html#segments-and-merging
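Among other things, that page suggests disabling refreshes and relaxing the store throttle for the duration of a heavy bulk load. A sketch of those two settings calls for an Elasticsearch 1.x cluster, assuming a hypothetical index named "myindex" (remember to restore both settings once the load finishes):

import json
import requests

ES = "http://localhost:9200"

# Disable refreshes on the index being loaded ("myindex" is a placeholder);
# set it back to e.g. "1s" after the bulk load completes.
requests.put(
    "%s/myindex/_settings" % ES,
    data=json.dumps({"index": {"refresh_interval": "-1"}}),
)

# Lift the store (merge) throttle cluster-wide while bulk indexing; the
# default caps merge I/O at 20 MB/s, and when merges fall behind,
# Elasticsearch throttles indexing to let them catch up.
requests.put(
    "%s/_cluster/settings" % ES,
    data=json.dumps({"transient": {"indices.store.throttle.type": "none"}}),
)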