This may be a stupid question, but I can't find anything explicit on this, so I'll give it a shot:
It seems that the usual design pattern in XML databases is to treat XML files like "rows" in a relational database. I found explicit recommendations on this for MarkLogic, where using many small XML files is recommended over a few big ones, but I can't find the same for eXist-db. Is it recommended there too? I mean, should I use many (thousands of?) XML files or one big one? Which is preferable with eXist-db?
Here is a dummy example:
Customers (let's say 100,000) with personal information (let's say 15 XML elements with text content): one XML file per customer, or one file with all the customers?
For queries, it doesn't make a big difference whether the data is stored in many small documents or one big one. For updates, though, small documents are often preferable: it is usually more efficient to replace a small document than to run updates on a large one.
It really depends on how often the data changes. If updates occur at high frequency, maintaining small documents is likely more efficient. To simplify maintenance and improve performance, you may even consider organising the documents into smaller sub-collections according to some criterion. Note: do not forget to increase the collectionCache setting if you work with many thousands of small documents.
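To make the "one document per customer, grouped into sub-collections" idea concrete, here is a minimal sketch in Python that splits one big file into small ones on disk before loading them. It is only an illustration under assumptions: the element name `customer`, the `id` attribute, the `bucket-NNNN` directory naming and the `split_customers` helper are all hypothetical, and uploading the resulting files into eXist-db is not shown.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_customers(big_xml: str, out_dir: Path, bucket_size: int = 1000) -> list[Path]:
    """Write one XML file per <customer>, grouped into bucket sub-directories.

    Each bucket directory stands in for a sub-collection. The grouping
    criterion here is simply the position in the source file; any stable
    criterion (region, id prefix, ...) would work the same way.
    """
    root = ET.fromstring(big_xml)
    written: list[Path] = []
    # Only direct <customer> children of the root element are considered.
    for index, customer in enumerate(root.findall("customer")):
        bucket = out_dir / f"bucket-{index // bucket_size:04d}"
        bucket.mkdir(parents=True, exist_ok=True)
        # Assumption: every customer has a unique, filename-safe "id" attribute.
        target = bucket / f"{customer.get('id')}.xml"
        ET.ElementTree(customer).write(str(target), encoding="utf-8", xml_declaration=True)
        written.append(target)
    return written
```

With 100,000 customers and the default `bucket_size` of 1000, this would produce 100 directories of 1000 small files each. Changing one customer then means replacing a single small document rather than updating a large one.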