I'm building software that indexes and analyzes a large amount of blockchain data, and I'm considering using TimescaleDB for better analytics performance, since I'm running into performance issues with plain Postgres.
My current data structure looks like this:
1 block comes in at a given timestamp. It contains 5 transactions, and each transaction contains 10 transfers. In the same Postgres transaction I'm inserting the block row, the 5 transaction rows, and the 50 transfer rows.
Each row has a timestamp field taken from the block.
Then, asynchronously (because it needs an extra network request), for each transfer I insert a "balance_change" event, which also has this timestamp field.
All four entities are inserted into their own tables.
At first, I was considering turning those tables into hypertables directly, but there are some foreign key references between them to ensure data integrity, and from what I've read, Timescale doesn't support foreign keys from one hypertable to another.
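For illustration, here is a rough sketch of what the current layout looks like (column names are simplified, not my exact schema):

```sql
-- Simplified sketch of the current normalized layout (names are placeholders).
CREATE TABLE blocks (
    id BIGINT PRIMARY KEY,
    ts TIMESTAMPTZ NOT NULL
);

CREATE TABLE transactions (
    id       BIGINT PRIMARY KEY,
    block_id BIGINT NOT NULL REFERENCES blocks (id),
    ts       TIMESTAMPTZ NOT NULL            -- copied from the block
);

CREATE TABLE transfers (
    id             BIGINT PRIMARY KEY,
    transaction_id BIGINT NOT NULL REFERENCES transactions (id),
    ts             TIMESTAMPTZ NOT NULL
);

CREATE TABLE balance_changes (
    id          BIGINT PRIMARY KEY,
    transfer_id BIGINT NOT NULL REFERENCES transfers (id),
    ts          TIMESTAMPTZ NOT NULL
);
```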
Now I'm thinking about creating an "events" table and inserting everything into it with a timestamp, a label ("block", "transaction", "transfer", "balance_change", etc.), a user_id (and other partitioning fields), and a JSONB data field that contains the actual values.
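Roughly what I have in mind (just an illustrative sketch, column names are placeholders):

```sql
-- Illustrative sketch of the single "events" hypertable idea.
CREATE TABLE events (
    ts      TIMESTAMPTZ NOT NULL,  -- timestamp taken from the block
    label   TEXT        NOT NULL,  -- 'block', 'transaction', 'transfer', 'balance_change', ...
    user_id BIGINT      NOT NULL,  -- one of the extra partitioning fields
    data    JSONB       NOT NULL   -- the actual values for that event type
);

-- Partition on the timestamp column.
SELECT create_hypertable('events', 'ts');
```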
Or maybe there is something better to do? What about a separate table for each type of event?
I've never used TimescaleDB before, so I'd love some help on how to best design this. Also, the quantity of data grows pretty quickly, so I'm trying to avoid duplication as much as possible.
I think the main question is how you're going to consume your data and how fast those reads need to be.
If your need is to read fast, optimize for reading by moving toward a more specialized structure with already pre-computed values, which can make reads even faster.
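For example, one way to get pre-computed values in TimescaleDB is a continuous aggregate. A minimal sketch, assuming a `transfers` hypertable with `ts` and `amount` columns (names are made up here):

```sql
-- Continuous aggregate that pre-computes hourly rollups over an assumed
-- 'transfers' hypertable with 'ts' and 'amount' columns.
CREATE MATERIALIZED VIEW transfers_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', ts) AS bucket,
       count(*)                  AS transfer_count,
       sum(amount)               AS total_amount
FROM transfers
GROUP BY bucket;

-- Keep it refreshed automatically as new rows arrive.
SELECT add_continuous_aggregate_policy('transfers_hourly',
    start_offset      => INTERVAL '3 hours',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '30 minutes');
```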
Regarding "What about a separate table for each type of event?": if your final analysis will compute each event type separately, that can be a nice denormalization.
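A brief sketch of that option, one hypertable per event type (table names are made up):

```sql
-- One hypertable per event type; repeat the pattern for
-- 'block_events', 'transaction_events', 'balance_change_events', ...
CREATE TABLE transfer_events (
    ts      TIMESTAMPTZ NOT NULL,
    user_id BIGINT      NOT NULL,
    data    JSONB       NOT NULL
);
SELECT create_hypertable('transfer_events', 'ts');
```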
Check the advantages of narrow vs. wide schemas, and maybe consider using a metadata table.
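A rough sketch of the two shapes, with a small metadata table factoring out the descriptive attributes (column names are illustrative):

```sql
-- Narrow layout: one value per row, with descriptive attributes factored
-- out into a small metadata table (a hypertable may reference a plain table).
CREATE TABLE event_types (
    id    SMALLINT PRIMARY KEY,
    label TEXT NOT NULL               -- 'block', 'transaction', 'transfer', ...
);

CREATE TABLE events_narrow (
    ts            TIMESTAMPTZ NOT NULL,
    event_type_id SMALLINT    NOT NULL REFERENCES event_types (id),
    value         NUMERIC     NOT NULL
);
SELECT create_hypertable('events_narrow', 'ts');

-- Wide layout: one row per timestamp, one column per metric.
CREATE TABLE events_wide (
    ts             TIMESTAMPTZ NOT NULL,
    transfer_count INTEGER,
    total_amount   NUMERIC
);
SELECT create_hypertable('events_wide', 'ts');
```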