I just installed Elasticsearch and have not loaded any data at all. htop shows Elasticsearch running a bunch of threads.
Why is Elasticsearch running all these threads? What is it doing?
I know that you can configure htop to group all of the threads into just one line. But that still does not answer the question of why anything is running at all.
The simple answer is that, in order to be efficient, ES uses many thread pools to carry out the many things it needs to do.
As you probably know, ES provides a very powerful search engine. To let a potentially massive number of users run a potentially massive number of queries efficiently, ES uses a pool of threads to carry out that work.
That's not the end of the story. While all those users might be searching like mad, other users or processes can also index a potentially massive amount of data at the same time. For that reason ES needs another thread pool for handling the many indexing requests it can get. Those indexing requests can come in two forms: indexing a single document, or indexing many documents in bulk. Older versions of ES used two different thread pools for those two cases (index and bulk); more recent versions handle both in a single write pool.
That's still not the end of the story. While some users are searching and others are indexing data, there might be a backup process running (what ES calls snapshotting). For that there's another thread pool.
And so on. The list above is not exhaustive, but you can trust that ES has several thread pools to handle what it needs to handle, and that it knows how to do so efficiently: most pools are sized from the number of available processors, so ES will only create as many threads as your machine can reasonably handle.
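To give a feel for how pool sizes follow the processor count, here is a rough sketch in Python. The two formulas are the defaults documented for recent ES versions as far as I know (search: 1.5 times the processors plus one, write: one thread per processor); treat them as assumptions, since they can differ between versions and can be overridden in the configuration. The function name default_pool_sizes is made up for this example.

```python
import os


def default_pool_sizes(processors=None):
    """Estimate default sizes for two ES thread pools.

    Assumption: the formulas below match the defaults documented for
    recent Elasticsearch versions; check the docs for your version.
    """
    # Fall back to the local CPU count when no value is given.
    n = processors or os.cpu_count() or 1
    return {
        # search pool: int((processors * 3) / 2) + 1
        "search": (n * 3) // 2 + 1,
        # write pool: one thread per processor
        "write": n,
    }
```

On a 4-core machine this gives 7 search threads and 4 write threads, which already accounts for a fair share of the lines htop shows once the other pools are added in.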
You can review the full list of thread pools that ES manages, and you'll probably better understand what it is doing. You can also use the /_cat/thread_pool and /_nodes/hot_threads endpoints to better visualise what those threads are doing.
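As a minimal sketch of how you might query the thread pool endpoint from a script, the Python below fetches /_cat/thread_pool as JSON and totals the counters per pool. It assumes an unsecured node at http://localhost:9200 (a default local install; adjust the URL and add authentication if yours differs). The function names are my own, not part of any ES client.

```python
import json
from urllib.request import urlopen


def parse_thread_pools(payload):
    """Turn the JSON body of /_cat/thread_pool into {pool: counters}.

    The cat API returns numbers as strings and one row per node and
    pool, so values are converted to int and summed across nodes.
    """
    pools = {}
    for row in json.loads(payload):
        stats = pools.setdefault(
            row["name"], {"active": 0, "queue": 0, "rejected": 0}
        )
        for key in stats:
            stats[key] += int(row[key])
    return pools


def fetch_thread_pools(base_url="http://localhost:9200"):
    """Fetch thread pool counters; base_url is an assumed default."""
    url = (
        base_url
        + "/_cat/thread_pool?format=json&h=name,active,queue,rejected"
    )
    with urlopen(url, timeout=5) as resp:
        return parse_thread_pools(resp.read().decode("utf-8"))
```

Calling fetch_thread_pools() on a freshly installed node should show every pool with zero or near-zero active and queue values, which confirms that the threads you see in htop are mostly idle and waiting for work.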