Does Ceph RBD have the ability to load balance?


I don't know much about Ceph. As far as I know, RBD is Ceph's distributed block storage device, and the same data is stored on several of the computers that make up the Ceph cluster. So, does this distributed block device (Ceph RBD) have the ability to load balance? In other words, if multiple clients (in my situation, QEMU) use this RBD block storage and read the same data at the same time, will Ceph RBD balance the traffic and send it to the clients simultaneously from different computers in the cluster, or will just one computer send its data to all of the clients? Also, if I have one Ceph cluster composed of 6 computers and another composed of 3 computers, is there any difference in RBD performance between them?


Solution

  • It's not load balancing as such, but the distributed nature of Ceph allows many clients to be served in parallel. If we focus on replicated pools with a size of 3, there are 3 different disks (on different hosts) storing the exact same object. There is always a primary OSD, though, which forwards write requests to the other copies. This makes write requests a little slower, but read requests are served only by the primary OSD, so reading is much faster than writing. And since clients "talk" directly to the OSDs (they get the addresses from the MON), many clients can be served in parallel, especially because the OSDs don't store an RBD as a single object but split it into many objects grouped into "Placement Groups". However, if you are really talking about the exact same object being read by multiple clients, you should know that there are watchers on RBDs which lock them so that only one client can change data. If you could describe your scenario in more detail, we could provide more information.

    If I have a Ceph cluster composed of 6 computers and a Ceph cluster composed of 3 computers, is there any difference in the performance of these RBDs?

    It depends on the actual configuration (a reasonable number of PGs, CRUSH rules, network, etc.), but in general the answer is yes: the more Ceph nodes you have, the more clients you can serve in parallel. Ceph may not have the best performance compared to other storage systems (depending on the actual setup, of course), but it scales so well that performance stays about the same as the number of clients increases.

    https://ceph.readthedocs.io/en/latest/architecture/
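
To make the "split into many objects" point more concrete, here is a rough Python sketch of how an RBD image might spread across OSDs. It is a toy model under stated assumptions, not how Ceph is implemented: a plain hash stands in for the real CRUSH placement, `pg_num=128` and the image id `abc123` are made up for illustration, each host is assumed to run a single OSD, and replicas are ignored (only the primary is modelled). The 4 MiB object size and the `rbd_data.<id>.<object number>` naming follow the RBD defaults for format 2 images.

```python
import hashlib
from collections import Counter

OBJECT_SIZE = 4 * 1024 * 1024  # default RBD object size (4 MiB)


def object_name(image_id, offset):
    """Name of the RADOS object that backs the byte at `offset` in the image."""
    return f"rbd_data.{image_id}.{offset // OBJECT_SIZE:016x}"


def toy_primary(obj, num_osds, pg_num=128):
    """Toy stand-in for CRUSH (assumption): hash object -> PG -> primary OSD."""
    pg = int(hashlib.sha256(obj.encode()).hexdigest(), 16) % pg_num
    return int(hashlib.sha256(f"pg-{pg}".encode()).hexdigest(), 16) % num_osds


def primaries_for_image(image_id, image_size, num_osds):
    """Count how many of the image's objects each OSD is primary for."""
    counts = Counter()
    for offset in range(0, image_size, OBJECT_SIZE):
        counts[toy_primary(object_name(image_id, offset), num_osds)] += 1
    return counts


if __name__ == "__main__":
    size = 10 * 1024**3  # a hypothetical 10 GiB image
    for n in (3, 6):
        counts = primaries_for_image("abc123", size, n)
        print(f"{n} OSDs:", sorted(counts.items()))
```

In this model, two clients reading different offsets usually hit different primary OSDs, while two clients reading the same offset hit the same one. With 6 OSDs instead of 3, each OSD is primary for fewer objects, which is the intuition behind a larger cluster serving more clients in parallel. On a real cluster, `ceph osd map <pool> <object>` shows the actual PG and OSDs for a given object.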