I can't understand how does solr handle high availability in solrCloud. In its reference guide it pointed out that it uses CDCR to handle HA. But I think that this is an expensive strategy.
Can anyone tell what it actually handle HA and why is it an optimum way? Thanks a lot.
There are a few levels of HA - you need to ask yourself, what kinds of failures can I tolerate? Things like:
SolrCloud's basic cluster setup provides you the tools to cover #1-3 pretty easily. Add replicas, distribute them correctly among racks.
You can get #4, or even #5, using a single SolrCloud cluster spread around multiple data centers (Multi-AZ in AWS for #4, or Multi-Region in AWS for #5), but a single SolrCloud cluster doesn't have any locality awareness, so you need to understand that intra-cluster communication will often be cross-data-center, so the data centers really need to be low-latency between each other or your query latency will suffer badly.
SolrCloud's CDCR is a way to connect two or more independent SolrCloud clusters, and essentially create master/slave relationships between clusters. This gives you #4 or #5 without the penalty of cross-cluster traffic latency.