I am relatively new to this, so I am trying to understand the relationships among Zookeeper, SolrCloud, and HTTP requests.
My understanding is:
Zookeeper (accessible through port 2181) keeps the config files for SolrCloud, and all HTTP requests go to a SolrCloud instance directly rather than going through Zookeeper.
Therefore Zookeeper, in this particular case, is not used for its ability to route (API) requests? I do not really think that should be the case, but based on the tutorials on the official Solr site, it seems all requests need to go through Solr's port 8983.
Solr uses Zookeeper to keep its clusterstate (which servers have which cores / shards / parts of the complete collection), as well as configuration files and anything else that should be available throughout the cluster.
The request itself is made to Solr, and Solr uses the information from Zookeeper in the background to route the request internally to the correct location. A client can also be Cloud Aware (such as SolrJ): it queries Zookeeper itself and then contacts the correct Solr server directly, instead of having Solr route the request internally. In SolrJ this is implemented as CloudSolrClient (or CloudSolrServer, as it might be named in older versions of SolrJ), as opposed to the regular SolrServer, which contacts the Solr instance you're referencing and lets that instance route the request from there.
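As a rough illustration of the first path, where the client only ever talks to Solr over HTTP, here is a minimal sketch that builds a query URL for a Solr node. The host name, the collection name `mycollection`, and the helper `build_select_url` are made up for this example; only the default port 8983 and the `/solr/<collection>/select` handler come from a standard Solr setup.

```python
from urllib.parse import urlencode

SOLR_PORT = 8983  # default Solr HTTP port, as mentioned in the question


def build_select_url(host, collection, query):
    """Build the URL for a query sent to one Solr node (hypothetical helper).

    Any node in the cluster can receive this request; Solr itself uses the
    cluster state from Zookeeper to forward it to the right shards.
    """
    params = urlencode({"q": query, "wt": "json"})
    return f"http://{host}:{SOLR_PORT}/solr/{collection}/select?{params}"


# Example (assumed host and collection names):
# build_select_url("localhost", "mycollection", "title:solr")
```

Note that nothing in this request mentions Zookeeper or port 2181; the client does not need to know Zookeeper exists.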
If you look at the documentation of CloudSolrClient, you can see that it takes the Zookeeper information as its argument, and not the Solr Server address. SolrJ makes a ZK request to Zookeeper, retrieves the clusterstate, then makes the HTTP request directly to the servers hosting the shard or collection.
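To make the cloud-aware flow a bit more concrete, here is a toy sketch of the lookup step. It is not how SolrJ is implemented: the `CLUSTER_STATE` dictionary is a simplified, hard-coded stand-in for what a client would read from Zookeeper (the real state format has more fields and may be laid out differently), and `leader_urls` is a hypothetical helper.

```python
# Simplified, hypothetical stand-in for the cluster state kept in Zookeeper.
# A real cloud-aware client would read this from Zookeeper instead.
CLUSTER_STATE = {
    "mycollection": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"base_url": "http://solr1:8983/solr",
                                   "state": "active", "leader": True},
                    "core_node2": {"base_url": "http://solr2:8983/solr",
                                   "state": "down", "leader": False},
                }
            },
            "shard2": {
                "replicas": {
                    "core_node3": {"base_url": "http://solr2:8983/solr",
                                   "state": "active", "leader": False},
                    "core_node4": {"base_url": "http://solr3:8983/solr",
                                   "state": "active", "leader": True},
                }
            },
        }
    }
}


def leader_urls(cluster_state, collection):
    """Return {shard_name: base_url} for the active leader of each shard."""
    result = {}
    for shard_name, shard in cluster_state[collection]["shards"].items():
        for replica in shard["replicas"].values():
            if replica["state"] == "active" and replica["leader"]:
                result[shard_name] = replica["base_url"]
    return result
```

With this state, `leader_urls(CLUSTER_STATE, "mycollection")` maps `shard1` to the `solr1` node and `shard2` to the `solr3` node, so the client can send its HTTP requests straight to those servers on port 8983 without an extra hop through another Solr node.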