I'm going to use Xodus for storing time-series data (100-500 million rows are inserted daily).
I saw that Xodus was creating and deleting a lot of .xd files in the background. I read about the log-structured design, but I don't clearly understand whether a file is created on each transaction commit. Does each file represent a snapshot of the whole database? Is there any way to disable transactions (I don't need them)?
Can I get any performance benefit by sharding my data between different stores? I could store every metric in a separate store instead of using one store with a composite key. For now I'm creating a separate store for each day.
The .xd files don't correspond to particular transactions. The files are ordered, so together they can be thought of as an infinite log of records. Each transaction appends its changes to that log, along with some meta information that makes it possible to retrieve and search for the saved data. Every .xd file has a maximum size, and when that size is reached a new file is created.
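
To make the relationship between commits and files concrete, here is a toy model of an append-only log that rolls over to a new file once the current one is full. It is not Xodus code: the class `ToyLog`, its methods, and the 100-byte limit are all made up for illustration, and it does not model details such as whether a real record may span two files.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of an append-only log split into size-limited files.
// Hypothetical illustration only; none of these names come from Xodus.
final class ToyLog {
    private final int maxFileSize;                              // assumed per-file limit, in bytes
    private final List<Integer> fileSizes = new ArrayList<>();  // bytes used in each file

    ToyLog(int maxFileSize) {
        if (maxFileSize <= 0) {
            throw new IllegalArgumentException("maxFileSize must be positive");
        }
        this.maxFileSize = maxFileSize;
        fileSizes.add(0); // start with one empty file
    }

    // Appends the bytes written by one commit to the tail of the log.
    // A new file is opened only when the current one is full.
    void commit(int bytes) {
        int remaining = bytes;
        while (remaining > 0) {
            int last = fileSizes.size() - 1;
            int free = maxFileSize - fileSizes.get(last);
            if (free == 0) {
                fileSizes.add(0); // current file is full: roll over
                continue;
            }
            int written = Math.min(free, remaining);
            fileSizes.set(last, fileSizes.get(last) + written);
            remaining -= written;
        }
    }

    int fileCount() {
        return fileSizes.size();
    }
}
```

With a 100-byte limit, three 30-byte commits all land in the same file, and a new file appears only once the accumulated bytes exceed the limit. The number of files follows the volume of data written, not the number of commits.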
It is not possible to disable transactions.
In general, sharding your data between different stores gives better performance: the smaller the stores are, the faster and more smoothly GC works in the background. Keep in mind that the way you shard your data defines the way you can retrieve it. If the data in different shards is completely decoupled, then it is even better to put the shards in different environments rather than in different stores of a single environment. That isolates the shards physically, not only logically.
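
As a rough sketch of the environment-per-shard layout, the helper below maps each metric to its own directory and each day to a store name, following the one-store-per-day scheme from the question. The class `ShardLayout`, the directory naming, and the `day-` prefix are assumptions made for illustration. The code uses only the Java standard library and does not open anything in Xodus itself.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;

// Hypothetical helper that decides where each shard lives on disk.
// Each returned directory is meant to back one separate Xodus environment
// (for example, opened with Environments.newInstance(dir.toString())).
final class ShardLayout {
    private final Path root;

    ShardLayout(Path root) {
        this.root = root;
    }

    // One directory, and therefore one environment, per metric.
    Path environmentDir(String metric) {
        return root.resolve(sanitize(metric));
    }

    // One store per day inside that environment; LocalDate prints as yyyy-MM-dd.
    static String storeName(LocalDate day) {
        return "day-" + day;
    }

    // Replaces characters that are awkward in directory names with '_'.
    private static String sanitize(String metric) {
        return metric.replaceAll("[^A-Za-z0-9._-]", "_");
    }
}
```

Since every metric gets its own directory, the files of one shard never mix with those of another. The trade-off is that a query spanning several metrics has to open and read several environments.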