I did just come up with the following idea, but I lack the knowledge to say if it is applicable to a production application.
We have a web application, built on PHP/mySQL to make it easy. A table in the database are prone to grow large - a couple million of records easily, so table sharding might be an option here.
Here's how I have imagined the process to work:
A cached file contains the a list with the available tables in the database. Each table contains a maximum of a million rows and when that is reached, the cached list is recreated after a new table has been constructed.
Obviously it wouldn't be a good idea to check the number of rows on every write to the table, so this could be done on a set interval, like a week, or daily - depending on how quick every million of data is created.
Would this be a good way to deal with large amount of data and to keep index sizes fairly low?
Thanks
If you are planning ahead for the possibility of enormous growth (game gone viral, for instance) you can follow the steps of those before you and go NoSQL.
Couchbase / powers Zinga (and is a personal favorite)
Apache Cassandra / powers Twitter
mongoDB / powers Craiglist
But you're building a site in php/MySQL to "make it easy" so don't re-invent the wheel on an extremely big problem.
Don't mess with the data. Go for a proven solution. MySQL included.