
Is it safe to run Ceph with 2-way replication on 3 OSD nodes?


Let's say I want to achieve maximum usable capacity with data resilience on this 3-OSD-node setup, where each node contains 2x 1 TB OSDs.

Is it safe to run 3 Ceph nodes with 2-way replication?

What are the pros and cons of using 2-way replication? Will it cause a data split-brain?

Last but not least, what failure-domain fault tolerance will I get when running 2-way replication?

Thanks!


Solution

  • Sometimes even three replicas are not enough, e.g. if SSD disks (from a cache tier) fail together or one by one.

    http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-October/005672.html

    For two OSDs you can even manually set 1 replica as the minimum and 2 replicas as the maximum (I did not manage to get this set automatically when one of the three OSDs failed):

    osd pool default size = 2 # Write an object 2 times

    osd pool default min size = 1 # Allow writing 1 copy in a degraded state

    But the command ceph osd pool set mypoolname min_size 1 sets this for an existing pool, whereas the options above only change the defaults used when new pools are created.
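
    For example, to apply and verify these values on an existing pool (a minimal sketch; mypoolname is just the placeholder pool name used above, replace it with your own):

    ceph osd pool set mypoolname size 2
    ceph osd pool set mypoolname min_size 1
    ceph osd pool get mypoolname size
    ceph osd pool get mypoolname min_size
    ceph osd pool ls detail   # lists size/min_size for every pool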

    For n = 4 nodes, each with 1 OSD and 1 mon, and settings of min_size 1 and size 4, three OSDs can fail but only one mon can fail (the monitor quorum requires more than half of the monitors to survive, i.e. floor(n/2) + 1 of them). 4 + 1 = 5 monitors are required to tolerate two failed monitors (at least one of them must be external, on a host without an OSD). With 8 monitors (four of them external), quorum needs 5, so three mons can fail; then even three nodes, each with 1 OSD and 1 mon, can fail. I am not sure that a setup with 8 monitors is possible.
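
    To see how many monitors are currently in quorum, you can use the standard Ceph CLI (the exact output format varies between releases):

    ceph mon stat                             # one-line summary of monitors and quorum
    ceph quorum_status --format json-pretty   # detailed quorum membership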

    Thus, for three nodes, each with one monitor and one OSD, the only reasonable settings are min_size 2 with size 3 or 2; only one node can fail. If you have external monitors and set min_size to 1 (this is very dangerous) and size to 2 or 1, then 2 nodes can be down. But with only one replica (no copy, only the original data) you can lose your job very soon.
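
    If you stay with the safer combination (size 3, min_size 2), the corresponding defaults and per-pool commands would look roughly like this (a sketch, again with mypoolname as a placeholder):

    osd pool default size = 3
    osd pool default min size = 2

    ceph osd pool set mypoolname size 3
    ceph osd pool set mypoolname min_size 2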