How would you tackle the following storage and retrieval problem?
Roughly 2.000.000 rows will be added each day (365 days/year) with the following information per row:
entity_id combined with date_id is unique. Hence, at most one row per entity and date can be added to the table. The database must be able to hold 10 years worth of daily data (7.300.000.000 rows (3.650*2.000.000)).
What is described above is the write patterns. The read pattern is simple: all queries will be made on a specific entity_id. I.e. retrieve all rows describing entity_id = 12345.
Transactional support is not needed, but the storage solution must be open-sourced. Ideally I'd like to use MySQL, but I'm open for suggestions.
Now - how would you tackle the described problem?
Update: I was asked to elaborate regarding the read and write patterns. Writes to the table will be done in one batch per day where the new 2M entries will be added in one go. Reads will be done continuously with one read every second.
Use partitioning. With your read pattern you'd want to partition by entity_id
hash.