I set up a three-node insecure cluster for testing on my local machine. I created a database and added a table with a few records. I then queried the zone configuration, which showed that num_replicas is 3, and the range has replicas on nodes {1, 2, 3}.
root@:26257/foo> show zone configuration for database foo;
     target     |              raw_config_sql
----------------+-------------------------------------------
  RANGE default | ALTER RANGE default CONFIGURE ZONE USING
                |     range_min_bytes = 134217728,
                |     range_max_bytes = 536870912,
                |     gc.ttlseconds = 90000,
                |     num_replicas = 3,
                |     constraints = '[]',
                |     lease_preferences = '[]'
(1 row)
Time: 2ms total (execution 2ms / network 0ms)
root@:26257/foo> show ranges from database foo;
  table_name | start_key | end_key | range_id | range_size_mb | lease_holder | lease_holder_locality | replicas | replica_localities
-------------+-----------+---------+----------+---------------+--------------+-----------------------+----------+---------------------
  bar        | NULL      | NULL    |       36 |      0.000105 |            2 |                       | {1,2,3}  | {"","",""}
(1 row)
Then I changed num_replicas to 5 with the query below. The number of replicas now exceeds the number of nodes available in the cluster, yet I didn't get an error.
root@:26257/foo> ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5, gc.ttlseconds = 100000;
CONFIGURE ZONE 1
Time: 174ms total (execution 174ms / network 0ms)
root@:26257/foo> show zone configuration for database foo;
     target     |              raw_config_sql
----------------+-------------------------------------------
  RANGE default | ALTER RANGE default CONFIGURE ZONE USING
                |     range_min_bytes = 134217728,
                |     range_max_bytes = 536870912,
                |     gc.ttlseconds = 100000,
                |     num_replicas = 5,
                |     constraints = '[]',
                |     lease_preferences = '[]'
(1 row)
Then I added a node to the cluster and expected the number of replicas for the range to grow. No additional replica was created; instead, the range was rebalanced onto the new node, giving {1, 2, 4}.
cockroach node ls --insecure
id
------
1
2
3
4
From the SQL console:
root@:26257/foo> show ranges from database foo;
  table_name | start_key | end_key | range_id | range_size_mb | lease_holder | lease_holder_locality | replicas | replica_localities
-------------+-----------+---------+----------+---------------+--------------+-----------------------+----------+---------------------
  bar        | NULL      | NULL    |       36 |      0.000105 |            2 |                       | {1,2,4}  | {"","",""}
(1 row)
According to the documentation, the replicas column should list the nodes that hold replicas of this range. With num_replicas set to 5, shouldn't this column show all 4 nodes? Is there something wrong with my understanding or my queries?
Although the CockroachDB docs don't make this clear, a cluster applies a replication factor only once it has at least that many nodes. So after setting the replication factor to 5 on the default range (which database foo inherits) and adding a fourth node, the cluster may rebalance replicas onto the new node if that makes sense, but it won't increase the number of replicas to 5 until there is a fifth node.
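To make that rule concrete, here is a minimal sketch of how the effective replica count could be modeled. The function name effective_replicas is hypothetical, and the logic is an assumption based on the behavior described in this thread, not CockroachDB's actual code. In particular, the step that rounds an unreachable target down to an odd number is my assumption for why four nodes still produce three replicas rather than four.

```python
def effective_replicas(num_replicas: int, live_nodes: int) -> int:
    """Hypothetical model of how many replicas a range ends up with.

    Assumption: this mirrors the behavior observed in the question,
    it is not taken from CockroachDB's source.
    """
    # A range cannot have more replicas than there are nodes to hold them.
    target = min(num_replicas, live_nodes)

    # If the configured replication factor is reachable, apply it as-is.
    if target == num_replicas:
        return target

    # Otherwise (assumed) avoid settling on an even replica count,
    # so a 4-node cluster keeps 3 replicas instead of growing to 4.
    if target % 2 == 0:
        target -= 1

    # Always keep at least one replica in this simplified model.
    return max(target, 1)
```

Under these assumptions, effective_replicas(5, 3) and effective_replicas(5, 4) both return 3, matching the {1,2,3} and {1,2,4} output above, while effective_replicas(5, 5) returns 5 once a fifth node joins.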