GlusterFS high CPU usage on read load

I have a GlusterFS setup with two nodes (node1 and node2) configured as a replicated volume.

The volume contains many small files, 8 KB to 200 KB in size. When I subject node1 to heavy read load, the glusterfsd and glusterfs processes together use ~100% CPU on both nodes.

There is no write load on either node. So why is the CPU load so high on both nodes?

As I understand it, all the data is replicated to both nodes, so reads "should" perform like reads from a local filesystem.

Solution

This is commonly related to small files, e.g. if you have PHP apps running from a Gluster volume.

This one bit me in the rear once. It mostly comes down to the fact that many PHP frameworks issue a lot of stat calls to see whether a file exists at a given spot; if it does not, they stat one level (directory) higher, or try a slightly different name. Repeat 1000 times. Per file.
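To make the probing pattern concrete, here is a minimal Python sketch of an autoloader-style lookup that counts every stat it issues. The function names, directory names, and extensions are hypothetical and only illustrate the idea; this is not how any particular PHP framework is implemented.

```python
import os
import tempfile


def resolve(name, search_dirs, extensions, counter):
    """Probe candidate paths in order and count every stat() issued."""
    for directory in search_dirs:
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            counter["stats"] += 1  # each probe is one stat() call
            try:
                os.stat(candidate)
            except FileNotFoundError:
                continue  # miss: fall through to the next candidate
            return candidate
    return None


def demo():
    """Build a throwaway tree where the file sits in the last place searched."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical search path: three directories, two extensions.
        dirs = [os.path.join(root, d) for d in ("app", "vendor", "lib")]
        for d in dirs:
            os.mkdir(d)
        target = os.path.join(dirs[-1], "Mailer.php")
        open(target, "w").close()

        counter = {"stats": 0}
        found = resolve("Mailer", dirs, [".inc", ".php"], counter)
        return found == target, counter["stats"]


if __name__ == "__main__":
    print(demo())
```

In this toy layout a single successful include costs six stat calls, five of which are misses. On a local filesystem those misses are cheap; on a network filesystem each one is a round trip.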

Now here's the catch: if you use replication, that lookup to check whether the file exists does not just happen on that node / the local brick, but on ALL the nodes / bricks involved. The cost can explode fast (especially on some cloud platforms, where IOPS are capped).
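A rough back-of-envelope sketch of that amplification, in Python. It assumes every stat turns into one lookup on every replica brick and ignores any client-side caching, so treat it as an upper-bound illustration only; the function name and the numbers are made up.

```python
def lookup_load(requests_per_sec, stats_per_request, replica_count):
    """Return (lookups per brick per second, total lookups per second).

    Assumption: each stat() becomes one lookup on every replica brick,
    with no caching in between.
    """
    per_brick = requests_per_sec * stats_per_request
    total = per_brick * replica_count
    return per_brick, total


if __name__ == "__main__":
    # Hypothetical workload: 200 page requests/s, 50 stat calls each, 2 replicas.
    print(lookup_load(200, 50, 2))
```

Under these assumed numbers, each brick handles 10000 lookups per second even though no file data is being written, which is consistent with both nodes being busy under a read-only load.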

This article helped me out significantly. In the end there was still a small penalty, but the benefits outweighed that.

https://www.vanderzee.org/linux/article-170626-141044/article-171031-113239/article-171212-095104