Please consider the following:
I am storing around 1.2 million TIF files, ranging from 40 KB to 120 KB in size.
These documents are stored on a Windows server with an NTFS file system.
The documents are stored using the following path layout:
C:\<client_id>\<doc_type_id>\image001\1.TIF
Example
C:\1\3\image001\1.TIF
It is a PHP-hosted system.
The performance is acceptable at this stage. I want to know what the best strategy is going forward, considering that the number of customers and documents is going to increase dramatically.
I am looking at replacing the complete storage with Jackrabbit CMS.
Would this be the way to go? Or is storing the documents in a deeper layout like:
Example
C:\1\1\167\2\453257\image001\image.TIF
going to be just as efficient?
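To make the deeper layout concrete, here is a minimal sketch of how such a path could be derived from a numeric document ID so that each bucket directory stays small. It is written in Python only for brevity; the same arithmetic ports directly to PHP. The function name `sharded_path`, the `fanout` parameter, and the two-level bucket scheme are assumptions for illustration, not the layout currently in use.

```python
from pathlib import PureWindowsPath


def sharded_path(root, client_id, doc_type_id, doc_id, fanout=1000):
    """Return a nested Windows path for doc_id.

    Hypothetical scheme: two bucket levels are derived from the document ID.
    Each leaf bucket holds at most `fanout` files and each upper bucket holds
    at most `fanout` leaf buckets.
    """
    if doc_id < 0 or fanout < 2:
        raise ValueError("doc_id must be >= 0 and fanout must be >= 2")
    # Upper bucket: which block of fanout * fanout documents this ID falls in.
    upper, rest = divmod(doc_id, fanout * fanout)
    # Leaf bucket: which block of `fanout` documents inside the upper bucket.
    leaf = rest // fanout
    return PureWindowsPath(
        root, str(client_id), str(doc_type_id), str(upper), str(leaf), f"{doc_id}.TIF"
    )


print(sharded_path("C:\\", 1, 3, 453257))
```

With a fan-out of 1000, the 1.2 million files would spread across roughly 1,200 leaf directories per client and document type, instead of accumulating in a single folder.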
Please take all other considerations of CMS vs. file system out of the picture, e.g. versioning, data backup.
Thanks.
Your question is very similar to this one. Is your load primarily reading images or writing them? If it is read scalability you need, that post describes memcached, which is probably all you need. Jackrabbit has many more features, but it is geared more toward hierarchical text content, and I am not sure it would perform any better on your images. Also, if you do choose Jackrabbit, make sure your content hierarchy is deep enough for it to stay efficient: any parent node with 10,000 or more children is going to have sub-par performance.
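If reads dominate, the caching idea can be sketched as a simple read-through cache in front of the file system. This is only an illustration under stated assumptions: `DictCache` is a hypothetical in-memory stand-in exposing `get`/`set`, and `read_image` is an invented helper name. In a real deployment a memcached client would take the place of `DictCache`; files of 40 KB to 120 KB fit comfortably under memcached's default 1 MB item limit.

```python
class DictCache:
    """Hypothetical in-memory stand-in for a memcached-style client."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        # Returns None on a miss, mirroring typical cache-client behaviour.
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value


def read_image(path, cache):
    """Return the file's bytes, touching the disk only on a cache miss."""
    data = cache.get(path)
    if data is None:
        with open(path, "rb") as handle:
            data = handle.read()
        cache.set(path, data)
    return data
```

Repeated requests for the same image are then served from memory, which leaves NTFS to handle only first reads and writes.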