Tags: mysql, ruby-on-rails, postgresql, rails-postgresql, multi-tenant

Rails multitenant architecture, scoping access to multiple tenants


We have a single-tenant database architecture at the moment, with MySQL running upward of 100 databases. We switch the database connection based on the subdomain using the Apartment gem, and all is dandy!
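
For context, the switching is wired up roughly like this (a simplified sketch of typical Apartment configuration, not our exact initializer; the excluded Client model is just an example):

```ruby
# config/initializers/apartment.rb
require 'apartment/elevators/subdomain'

Apartment.configure do |config|
  # models that live in the shared/global database rather than per tenant
  config.excluded_models = %w(Client)
end

# switch the tenant connection based on the request's subdomain
Rails.application.config.middleware.use Apartment::Elevators::Subdomain
```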

However, we have a requirement now to create so-called "Umbrella" clients which can access all the data from a group of our existing clients. I don't see this as immediately feasible with our single-tenant database architecture (I looked into it and querying multiple MySQL databases just seemed hellish), so I'm beginning to look at different implementations with Postgres schemas.

I'm looking for some advice:

  • Is it possible to query multiple schemas in Postgres and collate the results somehow (ideally with a Rails implementation)? I can foresee problems with conflicting primary keys.

  • Would it be better to have a new schema that somehow represents/duplicates all the data in the group of schemas that need to be accessed? It would need to be real-time.

  • If so, can something similar be achieved in my current multiple DB set-up with MySQL? (to minimise the pain)

I'm wary of using a database field to achieve multitenancy in MySQL as data security/privacy is a huge thing for this product, and there's so much potential for developer error that way.


Solution

  • Just thinking through this at a high level.

    You could create a Postgres view to access this data (although it will be slower than querying the underlying tables directly).
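
    A minimal sketch of what that view could look like, assuming hypothetical tenant schemas tenant_a/tenant_b and an orders table with these columns; the added tenant column also sidesteps the conflicting-primary-key problem:

    ```ruby
    # Hypothetical migration creating a cross-schema view; the schema and
    # table names (tenant_a, tenant_b, orders) are placeholders for your own.
    class CreateUmbrellaOrdersView < ActiveRecord::Migration
      def up
        execute <<-SQL
          CREATE VIEW umbrella_orders AS
            SELECT 'tenant_a' AS tenant, id, customer_id, total, created_at
              FROM tenant_a.orders
            UNION ALL
            SELECT 'tenant_b' AS tenant, id, customer_id, total, created_at
              FROM tenant_b.orders;
        SQL
      end

      def down
        execute 'DROP VIEW umbrella_orders;'
      end
    end
    ```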

    Then you hopefully have enough unique fields in your table to create a compound or composite key (so you wouldn't have to create a new key column, just an index). Because Rails 3 is ORM-agnostic, you could then use DataMapper (or maybe the newer ROM gem) to establish the connection for this one model.
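
    For instance, a hypothetical DataMapper model over the view above, treating (tenant, id) as the compound key (the column names are assumptions):

    ```ruby
    # Read-only model over the umbrella_orders view; the (tenant, id) pair
    # acts as the compound key, so no new key column is needed.
    class UmbrellaOrder
      include DataMapper::Resource

      storage_names[:default] = 'umbrella_orders'

      property :tenant,      String,  :key => true
      property :id,          Integer, :key => true
      property :customer_id, Integer
      property :total,       Decimal
      property :created_at,  DateTime
    end
    ```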

    If you do compound keys, realize that you might have to explicitly define the *to_param* method in your model to build the key as a string; that's what gets sent as the :id in a URL, and you'll need to unwrap it again on the way back in.
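
    Roughly, building on the hypothetical DataMapper model above (find_by_param is just an illustrative helper, not a library method):

    ```ruby
    class UmbrellaOrder
      # Serialise the compound key into the URL, e.g. "tenant_a-42"
      def to_param
        "#{tenant}-#{id}"
      end

      # Unwrap params[:id] back into its parts for lookups
      def self.find_by_param(param)
        tenant, _, id = param.rpartition('-')  # split on the last hyphen
        get(tenant, id)  # DataMapper's .get takes compound keys positionally
      end
    end
    ```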

    You can set up access to this view through a different Postgres user and then use Rails' support for multiple database connections to create a model for it. We did this previously to aggregate data from multiple tables with some restrictions on what was being shown, and I don't see why that wouldn't apply in your use case.
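
    If you'd rather stay on ActiveRecord for this piece, the wiring might look something like the sketch below (the umbrella database.yml entry and umbrella_readonly user are assumptions):

    ```ruby
    # config/database.yml -- hypothetical extra entry for a restricted user:
    #
    #   umbrella:
    #     adapter:  postgresql
    #     database: app_production
    #     username: umbrella_readonly
    #     password: <%= ENV['UMBRELLA_DB_PASSWORD'] %>

    # Abstract base class so umbrella models use the restricted connection
    class UmbrellaBase < ActiveRecord::Base
      self.abstract_class = true
      establish_connection :umbrella
    end

    # ActiveRecord alternative to the DataMapper model above
    class UmbrellaOrder < UmbrellaBase
      self.table_name = 'umbrella_orders'
    end
    ```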

    Another option is that perhaps you could use Mongo as the "transient query database". BSON ObjectIds would give you unique keys automatically, and you could create documents that are essentially flat, scalar copies of the SQL rows. Not sure you'd want to write back to the original databases in this case, though... but you feasibly could do it.
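
    A very rough sketch of that idea with Mongoid (the field names are assumptions; Mongo's ObjectId _id becomes the unique key):

    ```ruby
    # Transient, denormalised copies of rows pulled from each tenant schema;
    # the BSON ObjectId _id avoids primary-key collisions between tenants.
    class TransientOrder
      include Mongoid::Document

      field :tenant,      type: String
      field :source_id,   type: Integer  # the original per-tenant primary key
      field :customer_id, type: Integer
      field :total,       type: Float
    end

    # e.g. populated by a periodic job reading from each tenant:
    # TransientOrder.create!(tenant: 'tenant_a', source_id: row.id, total: row.total)
    ```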

    Bottom line, IMO: because you're dealing with multiple databases, the best solution lies on the database side of the house, handling the aggregation at the database layer rather than in application code.

    Having said all of this... it also seems like a process smell. If I read the problem as stated correctly, what you're really trying to do here is what Hadoop is designed for: essentially map/reduce over the relevant data (aka big data analysis).

    Good luck!