database-performance arangodb

Single vs. multiple databases in ArangoDB


The ArangoDB docs (https://docs.arangodb.com/3.11/concepts/data-structure/databases/#database-organization-on-disk) say:

"Data is physically stored in .sst files in a sub-directory engine-rocksdb that resides in the instance’s data directory. A single file can contain documents of various collections and databases."

According to the ArangoDB documentation, a single .sst file may contain data from different collections and databases. Will there be any performance improvement if I store the data in multiple databases rather than in a single database? Or are multiple databases in ArangoDB just a logical separation?

In short: does storing the data in multiple ArangoDB databases perform better than storing all of it in a single database?

Thanks.


Solution

  • A single query can only process data from a single database.

    If you can partition your collections into groups that could be split across different databases, the common constraint is still the CPU, RAM, and disk performance of the machine, because the queries running in those isolated databases all share the same hardware.

    Alternatively, you can look at a cluster deployment. The Enterprise Edition has some features that help you group data on cluster nodes in a way that improves query performance.

    But unless you have hundreds of millions of documents, or traversals more than 10 levels deep (the threshold is arbitrary and depends on your data), you are best off keeping things simple and running everything in a single database.

    A key design approach that protects you from the impact of changing how you access data is to use Foxx microservices: your application talks to the Foxx microservice and is not aware of the structure of the underlying collections.
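To make the two access paths in this answer concrete, here is a minimal Python sketch that only builds the HTTP requests and does not send them. It relies on ArangoDB's HTTP API placing the database name in the URL path (`/_db/<database>/_api/cursor` for AQL, and `/_db/<database>/<mount>/...` for a mounted Foxx service), which is why one query is scoped to one database. The database name `shop`, the collection `orders`, the mount point `/orders-api`, the route `/orders/42`, and both helper function names are hypothetical, chosen for illustration only. Authentication is left out.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def aql_request(base_url: str, database: str, query: str,
                bind_vars: Optional[dict] = None) -> Request:
    """Build (but do not send) a POST to the AQL cursor endpoint.

    The database is part of the URL path, so the query can only see
    collections that live in that one database.
    """
    url = f"{base_url.rstrip('/')}/_db/{quote(database, safe='')}/_api/cursor"
    body = json.dumps({"query": query, "bindVars": bind_vars or {}})
    return Request(url, data=body.encode("utf-8"), method="POST",
                   headers={"Content-Type": "application/json"})


def foxx_request(base_url: str, database: str, mount: str, path: str) -> Request:
    """Build (but do not send) a GET to a route of a mounted Foxx service.

    The caller only knows the service route, not the collection names
    behind it (mount and path here are hypothetical).
    """
    url = (f"{base_url.rstrip('/')}/_db/{quote(database, safe='')}"
           f"/{mount.strip('/')}/{path.lstrip('/')}")
    return Request(url, method="GET")


if __name__ == "__main__":
    # 8529 is ArangoDB's default port; "shop" and "orders" are made up.
    direct = aql_request("http://localhost:8529", "shop",
                         "FOR o IN orders FILTER o._key == @key RETURN o",
                         {"key": "42"})
    via_foxx = foxx_request("http://localhost:8529", "shop",
                            "/orders-api", "/orders/42")
    print(direct.get_method(), direct.full_url)
    print(via_foxx.get_method(), via_foxx.full_url)
```

With the second style, moving `orders` to another collection (or reshaping it) would only require changing the Foxx service, while the application keeps calling the same route.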