tidb, tikv

Can we run multiple TiDB instances connected to the same PD (and hence TiKV) cluster?


I want to set up a local TiDB cluster for benchmarking. Here are some of my doubts:

  • Can multiple TiDB instances connect to the same PD and TiKV cluster? (I only see a single TiDB instance in the official production deployment doc.)
  • If so, will transactions submitted to different TiDB instances satisfy the snapshot isolation level?
  • At the storage layer, does each TiKV node keep the entire dataset? (That is, is the replication factor equal to the number of TiKV nodes?)
  • If not, how do I configure the replication factor?

Solution

  • Can multiple TiDB instances connect to the same PD and TiKV cluster?

    Yes, you can add as many tidb-servers as you want to fulfill your needs.
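For a benchmark, one simple way to use several tidb-servers at once is to spread the client workers across them. The sketch below only shows the assignment logic; the host addresses are hypothetical placeholders, and it assumes each worker would then open an ordinary MySQL-protocol connection to its assigned host and port (4000 is TiDB's default port).

```python
from itertools import cycle

# Hypothetical tidb-server addresses (assumption, not from the question).
# 4000 is TiDB's default MySQL-protocol port.
TIDB_ENDPOINTS = [("10.0.1.1", 4000), ("10.0.1.2", 4000), ("10.0.1.3", 4000)]


def assign_endpoints(num_workers, endpoints=TIDB_ENDPOINTS):
    """Assign each benchmark worker a tidb-server in round-robin order."""
    rotation = cycle(endpoints)
    return {worker: next(rotation) for worker in range(num_workers)}


assignment = assign_endpoints(5)
for worker, (host, port) in assignment.items():
    # Each worker would connect to its own tidb-server; all of them
    # talk to the same PD and TiKV cluster underneath.
    print(f"worker {worker} -> {host}:{port}")
```

In a real deployment a load balancer in front of the tidb-servers can play the same role as this client-side rotation.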

  • If so, will transactions submitted to different TiDB instances satisfy the snapshot isolation level?

    Yes. TiDB is a distributed database that provides snapshot isolation by default, and the guarantee also holds across transactions submitted through different tidb-servers. TiDB implements distributed transactions with the Percolator transaction model. For implementation details, see this article: https://pingcap.com/blog/2016-11-17-mvcc-in-tikv/
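To make the snapshot isolation guarantee concrete, here is a toy, single-process multi-version store. It is a teaching sketch only, not TiKV's or Percolator's actual implementation (there are no locks, no two-phase commit, and no networking). The class names are made up for illustration, and the counter stands in for the timestamps that PD hands out in a real cluster. It shows the two properties that matter: a transaction reads from the snapshot taken at its start timestamp, and a write-write conflict is rejected at commit time.

```python
import itertools


class ConflictError(Exception):
    """Raised when another transaction committed the same key first."""


class ToyStore:
    """In-memory multi-version store; a teaching toy, not TiKV's implementation."""

    def __init__(self):
        self._clock = itertools.count(1)  # stands in for PD's timestamp allocation
        self._versions = {}               # key -> list of (commit_ts, value)

    def next_ts(self):
        return next(self._clock)

    def begin(self):
        return Txn(self, self.next_ts())


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}

    def get(self, key):
        if key in self.writes:  # a transaction sees its own buffered writes
            return self.writes[key]
        # Newest version committed at or before this transaction's snapshot.
        visible = [(ts, value) for ts, value in self.store._versions.get(key, [])
                   if ts <= self.start_ts]
        return max(visible, key=lambda pair: pair[0])[1] if visible else None

    def put(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        for key in self.writes:
            # Write-write conflict: the key was committed after our snapshot.
            if any(ts > self.start_ts for ts, _ in self.store._versions.get(key, [])):
                raise ConflictError(key)
        commit_ts = self.store.next_ts()
        for key, value in self.writes.items():
            self.store._versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts


store = ToyStore()
setup = store.begin()
setup.put("balance", 100)
setup.commit()

t1 = store.begin()  # imagine this one arrives via tidb-server A
t2 = store.begin()  # and this one via tidb-server B
t1.put("balance", 150)
t1.commit()

snapshot_read = t2.get("balance")  # t2 still sees the value from its snapshot
t2.put("balance", 50)
try:
    t2.commit()
    conflict = False
except ConflictError:
    conflict = True

print(snapshot_read, conflict)
```

Because both transactions draw their timestamps from the same source, it does not matter which front-end instance they entered through; that is the intuition behind the answer above.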

  • At the storage layer, does each TiKV node keep the entire dataset? (That is, is the replication factor equal to the number of TiKV nodes?)

    No. TiDB internally shards each table into small range-based chunks called "regions". Each region is approximately 100 MiB in size by default, and the default replication factor is 3. Each tikv-server in the cluster holds replicas of many regions (potentially hundreds of thousands), and with more than three TiKV nodes no single node needs to hold the entire dataset.
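As a rough back-of-the-envelope illustration of these numbers, the sketch below estimates how many regions a dataset splits into and how the replicas spread over the TiKV nodes. It assumes every region is full at about 100 MiB and that replicas are balanced perfectly evenly, which a real cluster only approximates; the function name and the sample figures are made up for illustration.

```python
import math

REGION_SIZE_MIB = 100  # approximate default region size quoted above
MAX_REPLICAS = 3       # default replication factor


def estimate_regions(dataset_gib, tikv_nodes,
                     region_size_mib=REGION_SIZE_MIB, replicas=MAX_REPLICAS):
    """Estimate region counts, assuming full regions and even balancing."""
    regions = math.ceil(dataset_gib * 1024 / region_size_mib)
    total_replicas = regions * replicas
    return {
        "regions": regions,
        "total_replicas": total_replicas,
        "replicas_per_node": math.ceil(total_replicas / tikv_nodes),
        # With as many nodes as replicas, every node ends up with all the data.
        "fraction_of_data_per_node": min(1.0, replicas / tikv_nodes),
    }


est = estimate_regions(dataset_gib=500, tikv_nodes=6)
print(est)
```

With 3 replicas on 6 nodes, each node stores about half of the data; only when the node count equals the replication factor does every node hold a full copy.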

  • If not, how do I configure the replication factor?

    PD reads its configuration file (conf/pd.yml) and uses the max-replicas setting in it. For more details, see https://github.com/pingcap/docs/blob/master/FAQ.md#is-the-number-of-replicas-in-each-region-configurable-if-yes-how-to-configure-it