ruby-on-rails ruby database activerecord

Best database strategy for a client-based website (Ruby on Rails)


I've built a nice website system that caters to the needs of a small niche market. I've been selling these websites over the last year by deploying copies of the software using Capistrano to my web server.

It occurs to me that the only difference in these websites is the database, the CSS file, and a small set of images used for the individual client's graphic design.

Everything else is exactly the same, or should be... Now that I have about 20 of these sites deployed, it is getting to be a hassle to keep them all updated with the same code. And this problem will only get worse.

I am thinking that I should refactor this system so that I can run one deployed set of Ruby code, dynamically selecting the correct database (and CSS/images) based on the URL of the incoming request.
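One common way to do this is a subdomain-per-client scheme, switching the ActiveRecord connection in a `before_action`. A minimal sketch, assuming hypothetical database names of the form `myapp_<subdomain>_production` (the naming and connection details are illustrations, not your actual setup):

```ruby
# Map an incoming request's host to a per-client database name.
# The "clientname.example.com" subdomain scheme is an assumption.
def database_for_host(host)
  subdomain = host.split('.').first
  "myapp_#{subdomain}_production"
end

# In the Rails app this would be wired up roughly like so
# (shown as a comment since it needs a running Rails app):
#
#   class ApplicationController < ActionController::Base
#     before_action :switch_database
#
#     private
#
#     def switch_database
#       ActiveRecord::Base.establish_connection(
#         adapter:  'mysql2',
#         database: database_for_host(request.host),
#         username: 'myapp',
#         password: ENV['DB_PASSWORD']
#       )
#     end
#   end
```

Note that `establish_connection` per request has connection-pooling implications, so you would want to test this under concurrent load before relying on it.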

It seems that there are two ways of handling the database:

  • using multiple databases, one for each client
  • using one database, with a client_id field in each table, and an extra 'client' table
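The second option can be sketched in plain Ruby (no Rails required here, and the `Item`/client data below are made up for illustration): every record carries a `client_id`, and all reads filter on it. In ActiveRecord the equivalent would be `has_many :items` on a `Client` model, so that queries go through `current_client.items` and the scoping is automatic rather than hand-written in every CRUD call.

```ruby
# Plain-Ruby illustration of client_id scoping: one shared table,
# every row tagged with its owning client.
Item = Struct.new(:client_id, :name)

ALL_ITEMS = [
  Item.new(1, 'widget'),
  Item.new(2, 'gadget'),
  Item.new(1, 'sprocket')
]

# Every read is filtered by client_id -- the moral equivalent of
# `current_client.items` in ActiveRecord.
def items_for_client(client_id)
  ALL_ITEMS.select { |item| item.client_id == client_id }
end
```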

The multiple database approach would be the simplest for me at the moment, since I wouldn't have to refactor every model in my application to add the client_id field to all CRUD operations.

However, it would be a hassle to have to run 'rake db:migrate' for tens or hundreds of different databases, every time I want to migrate the database(s). Obviously this could be done by a script, but it doesn't smell very good.
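Such a script can be small. A sketch, assuming a hypothetical `DATABASE` environment variable that your `database.yml` reads (e.g. via ERB) to pick the target database; the client database names here are made up:

```ruby
CLIENT_DATABASES = %w[myapp_acme myapp_globex myapp_initech]  # hypothetical list

# Build the shell commands that migrate each client database in turn.
def migrate_commands(databases)
  databases.map { |db| "DATABASE=#{db} rake db:migrate" }
end

# Print them (or pass each to `system` to actually run the migrations):
migrate_commands(CLIENT_DATABASES).each { |cmd| puts cmd }
```

In practice you would probably read the database list from a `clients` table or a config file rather than hard-coding it.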

On the other hand, every client will have 20K-50K items in an 'items' table. I am worried about the speed of fulltext searches when the items table has a half million or million items in it. Even with an index on the client_id field, I suspect that searches would be faster if the items were separated into different client databases.

If anyone has an informed opinion on the best way to approach this problem, I would very much like to hear it.


Solution

  • There are advantages to using separate DBs (including those you already listed):

    • Fulltext searches will become slow (depending on your server's capabilities) when you have millions of large text blobs to search.
    • Separating the DBs keeps table indexing fast for each client. In particular, taking on a new, large client might upset some of your earlier-adopting clients: suddenly their applications slow down for (to them) no apparent reason. Again, if you stay under your hardware's capacity, this might not be an issue.
    • If you ever drop a client, it'd be marginally cleaner to just pack up their DB than to remove all of their associated rows by client_id. And equally clean to restore them if they change their minds later.
    • If any clients ask for additional functionality that they are willing to pay for, you can fork their DB structure without modifying anyone else's.
    • For the pessimists: Less chance that you accidentally destroy all client data by a mistake rather than just one client's data. ;)
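To illustrate the "pack up their DB" point: with one database per client, archiving a departing client is a single dump command. A sketch of a helper that builds it (the backup path and `mysqldump` flags are illustrative assumptions, and the command itself is only constructed here, not executed):

```ruby
# Build the shell command that archives one client's separate database
# before dropping it. With the single-DB approach you would instead have
# to delete or export rows by client_id across every table.
def dump_command(db, out_dir = '/backups')
  "mysqldump --single-transaction #{db} | gzip > #{out_dir}/#{db}.sql.gz"
end
```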

    All that being said, the single DB solution is probably better given:

    • Your DB server's capabilities make the large single table a non-issue.
    • Your clients' databases are guaranteed to remain identical.
    • You aren't worried about being able to keep everyone's data compartmentalized for purposes of archiving/restoring or in case of disaster.