What are the differences between SHARDED and REPLICASET cluster types in the Terraform provider?
Can I just make a REPLICASET with num_shards = 3 and it would be identical to SHARDED? And vice-versa?
A replica set is traditional replication between the members of the set. One member is the primary, where all data-changing operations are executed. The rest are secondaries, which replicate the primary's operations and may be used to read data. If the primary becomes unavailable, another member of the replica set can be promoted to take over this role. Data is lost only if the promoted member was lagging behind the primary (replication lag) at that moment.
A sharded cluster in MongoDB distributes the data between its members, based on the shard key. For example, for the sharded collection X: Server A contains all data in range (a-f), Server B contains the data in range (g-i), Server C contains the data in range (j-p), and so on.
Every shard requires its own server or replica set. A sharded cluster also requires a mongos service, which routes requests to the correct shard.
Sharding is used when a collection or database is expected to receive a large volume of write/update operations or a huge number of read operations.
MongoDB Documentation about Sharding
MongoDB Documentation about Replicasets
Addressing your question about Terraform:
Terraform documentation for num_shards
num_shards - (Required) Provide this value if you set a cluster_type of SHARDED or GEOSHARDED. Omit this value if you selected a cluster_type of REPLICASET. This API resource accepts 1 through 50, inclusive. This parameter defaults to 1. If you specify a num_shards value of 1 and a cluster_type of SHARDED, Atlas deploys a single-shard sharded cluster. Don't create a sharded cluster with a single shard for production environments. Single-shard sharded clusters don't provide the same benefits as multi-shard configurations.
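So no, the options SHARDED and REPLICASET are not interchangeable: per the documentation above, num_shards applies only to SHARDED or GEOSHARDED clusters and should be omitted for REPLICASET. If nothing in your workload requires sharding, use a replica set.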
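To make the difference concrete, here is a rough sketch of how the two cluster types might be declared. It assumes the mongodbatlas_advanced_cluster resource in a provider version that still accepts num_shards inside a replication_specs block (newer provider versions may use a different schema, so check the documentation for the version you run). The variable project_id, the cluster names, the AWS region and the M30 instance size are all placeholders I made up for illustration.

```hcl
# Hypothetical input: the Atlas project that owns both clusters.
variable "project_id" {
  type = string
}

# Replica set: one primary plus secondaries, no sharding.
# num_shards is omitted, as the documentation asks for REPLICASET.
resource "mongodbatlas_advanced_cluster" "replica_set" {
  project_id   = var.project_id
  name         = "example-replicaset" # placeholder name
  cluster_type = "REPLICASET"

  replication_specs {
    region_configs {
      provider_name = "AWS"       # assumed cloud provider
      region_name   = "US_EAST_1" # assumed region
      priority      = 7

      electable_specs {
        instance_size = "M30" # assumed instance size
        node_count    = 3     # members of the replica set
      }
    }
  }
}

# Sharded cluster: data is split across 3 shards,
# and each shard is itself a 3-node replica set.
resource "mongodbatlas_advanced_cluster" "sharded" {
  project_id   = var.project_id
  name         = "example-sharded" # placeholder name
  cluster_type = "SHARDED"

  replication_specs {
    num_shards = 3

    region_configs {
      provider_name = "AWS"
      region_name   = "US_EAST_1"
      priority      = 7

      electable_specs {
        instance_size = "M30"
        node_count    = 3 # nodes per shard
      }
    }
  }
}
```

Note that node_count and num_shards mean different things: node_count is how many members replicate the same data, while num_shards is how many partitions the data is split into. Raising one does not substitute for the other.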