Tags: performance, nfs

Performance of a large directory structure in a networked application


I'm trying to find out what the performance of a large directory structure would be if deep directories were accessed on a shared NFS filesystem. The structure would be excessively large, with 4 levels of nested directories, each level containing 1024 directories (1024 at the root, 1024 in each subdirectory, and so on).
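
For scale, here's a quick count of what that layout implies (a sketch, assuming exactly 1024 entries per directory at every level):

    # 4 levels of nesting, 1024 subdirectories per directory at each level.
    FANOUT = 1024
    LEVELS = 4

    total = sum(FANOUT ** level for level in range(1, LEVELS + 1))
    print(f"{total:,}")  # 1,100,586,419,200 directories -- over a trillion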

This filesystem would be on a network repository that users would be accessing for their personal information. The data would be replicated on multiple servers and load-balanced, but still, each machine would have a decent load at all times.

If the 4th level contained the information the users were looking for, how bad would the performance be? What if they were all accessing different subdirectories? Could this be resolved by caching inode information, or not?

I've been searching on this for a while, but I'm primarily finding information on large files rather than large directory structures.


Solution

  • I did something like that at my work once. I don't remember the exact numbers offhand, but I think it was 8 levels deep, with 10 subdirectories at each level (user id 87654321 maps to directory 8/7/6/5/4/3/2/1/). It turned out that was not such a great idea: we started running into filesystem inode count limits, iirc (10^8 = 100,000,000 directories, not good). We switched to more subdirectories per level and far fewer levels, and the problems went away. Your situation sounds more manageable, but still, check that your filesystem would support the kinds of file and directory counts that you're anticipating.
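
    If it helps, here's a minimal sketch of that id-to-path mapping. The function name shard_path and its parameters are my own invention, and the wider, shallower variant only illustrates the kind of fix we ended up with, not our exact layout.

        import os

        def shard_path(user_id: int, base: int = 10, levels: int = 8) -> str:
            """Split a numeric user id into one path component per level.

            base=10, levels=8 (the scheme described above):
                87654321 -> '8/7/6/5/4/3/2/1'
            base=10000, levels=2 (wider and shallower; same id space,
            but only two directory lookups per access):
                87654321 -> '8765/4321'
            """
            width = len(str(base - 1))
            digits = str(user_id).zfill(levels * width)
            parts = [digits[i * width:(i + 1) * width] for i in range(levels)]
            return os.path.join(*parts)

        print(shard_path(87654321))                        # 8/7/6/5/4/3/2/1
        print(shard_path(87654321, base=10000, levels=2))  # 8765/4321

    The trade-off: for the same number of leaf directories, a larger fan-out with fewer levels means fewer intermediate directories to create and fewer path components to resolve on every access, at the cost of larger individual directories.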