I have a JRuby application that I am trying to create has_many through:
relationships for databases that live on two completely different servers. I understand that joins will not work across tables on different servers. What I would like is to simulate the joins so that a developer using the model doesn't have to be (as) aware of the cross-server join.
There are some additional complexities in this setup:
has_and_belongs_to_many
was in it.I've considered writing my own custom association, but that's a bit complex and I can't find any guides or really any starting points other then reading through the Rails code.
Is there an easy way to do this that I'm missing?
Is building a custom ActiveRecord association the best way to do this? and if so, where do I start?
Code similar to my setup:
config/database.yml
development:
adapter: postgresql
encoding: unicode
database: main
username: username
password: password
host: localhost
pool: 5
remote_development: # Read only
adapter: jdbcmssql
driver: com.microsoft.sqlserver.jdbc.SQLServerDriver
url: 'jdbc:sqlserver://foo.com;databaseName=main'
username: username
password: password
app/models/account.rb
class Portfolio < ActiveRecord::Base
#has_and_belongs_to_many :dim_users, join_table: :accounts_dim_user
end
app/models/remote_model_base.rb
class RemoteModelBase
require "#{Rails.root}/lib/sqljdbc4.jar"
self.abstract_class = true
establish_connection "remote_#{Rails.env}".to_sym
after_initialize :readonly!
end
app/models/dim_user.rb
class DimUser < RemoteModelBase
self.table_name = 'DimUser'
self.primary_key = 'dwidDimUser'
#has_and_belongs_to_many :accounts, join_table: :accounts_dim_user
end
config/schema.rb
ActiveRecord::Schema.define(version: 20140925200106) do
create_table "accounts", force: true do |t|
t.string "name"
t.datetime "created_at"
t.datetime "updated_at"
end
create_table "accounts_dim_user", force: true, id: false do |t|
t.integer "dwidUser"
t.integer "account_id"
t.datetime "created_at"
t.datetime "updated_at"
end
# Defined in the remote database but it might look something like this
# create_table "DimUser" do |t|
# t.integer dwidUser
# # ...
# end
Just ran through a quick scenario for you, here's the repo: https://github.com/beneggett/many_db_example
In the repo, I just did 2 different db's on my local, but it won't matter, the principal is the same:
This seemed to work well for me:
Tell the account about the account_dim_users join table association, but manually map the has_many through/habtm.
class Account < ActiveRecord::Base
has_many :account_dim_users
def dim_users
account_dim_users.map {|account_dim_user| DimUser.find_by(dwidUser: account_dim_user.dwidUser) }
end
end
This is important, because, as you well know, the standard join will not work; however mapping it through the model works fine.
AccountDimUser join table looks standard (I explicitly mapped the keys)
class AccountDimUser < ActiveRecord::Base
has_many :accounts
has_many :dim_users, primary_key: :dwidUser, foreign_key: :dwidUser
end
Manually map the account_dim_users association, and manually map the accounts association
class DimUser < ActiveRecord::Base
establish_connection "other_db".to_sym
after_initialize :readonly!
self.table_name = 'DimUser'
self.primary_key = 'dwidUser'
def account_dim_users
AccountDimUser.where(dwidUser: self.dwidUser)
end
def accounts
account_dim_users.map {|account_dim_user| Account.find(account_dim_user.account_id) }
end
end
This approach allows you to still use your Ruby objects in the standard way:
a = Account.first
Account Load (0.6ms) SELECT "accounts".* FROM "accounts" ORDER BY "accounts"."id" ASC LIMIT 1
=> #<Account:0x00000102d263d0> {
:id => 1,
:name => "New account",
:created_at => Mon, 29 Sep 2014 15:07:07 UTC +00:00,
:updated_at => Mon, 29 Sep 2014 15:07:07 UTC +00:00
}
--
a.account_dim_users
=> #<ActiveRecord::Associations::CollectionProxy [#<AccountDimUser id: 1, dwidUser: 1, account_id: 1, created_at: "2014-09-29 15:08:47", updated_at: "2014-09-29 15:08:47">, #<AccountDimUser id: 3, dwidUser: 5, account_id: 1, created_at: "2014-09-29 15:24:17", updated_at: "2014-09-29 15:25:06">]>
--
a.dim_users
AccountDimUser Load (0.3ms) SELECT "account_dim_users".* FROM "account_dim_users" WHERE "account_dim_users"."account_id" = $1 [["account_id", 1]]
DimUser Load (0.9ms) SELECT "DimUser".* FROM "DimUser" WHERE "DimUser"."dwidUser" = 1 LIMIT 1
DimUser Load (0.3ms) SELECT "DimUser".* FROM "DimUser" WHERE "DimUser"."dwidUser" = 5 LIMIT 1
=> [
[0] #<DimUser:0x0000010981af10> {
:id => 1,
:dwidUser => 1,
:created_at => Mon, 29 Sep 2014 15:06:44 UTC +00:00,
:updated_at => Mon, 29 Sep 2014 15:06:44 UTC +00:00
},
[1] #<DimUser:0x00000109838b00> {
:id => 5,
:dwidUser => 5,
:created_at => Mon, 29 Sep 2014 15:23:01 UTC +00:00,
:updated_at => Mon, 29 Sep 2014 15:23:01 UTC +00:00
}
]
--
d = DimUser.first
DimUser Load (0.5ms) SELECT "DimUser".* FROM "DimUser" ORDER BY "DimUser"."dwidUser" ASC LIMIT 1
=> #<DimUser:0x0000010990aad8> {
:id => 1,
:dwidUser => 1,
:created_at => Mon, 29 Sep 2014 15:06:44 UTC +00:00,
:updated_at => Mon, 29 Sep 2014 15:06:44 UTC +00:00
}
--
d.account_dim_users
AccountDimUser Load (0.5ms) SELECT "account_dim_users".* FROM "account_dim_users" WHERE "account_dim_users"."dwidUser" = 1
=> #<ActiveRecord::Relation [#<AccountDimUser id: 1, dwidUser: 1, account_id: 1, created_at: "2014-09-29 15:08:47", updated_at: "2014-09-29 15:08:47">]>
--
d.accounts
AccountDimUser Load (0.5ms) SELECT "account_dim_users".* FROM "account_dim_users" WHERE "account_dim_users"."dwidUser" = 1
Account Load (0.4ms) SELECT "accounts".* FROM "accounts" WHERE "accounts"."id" = $1 LIMIT 1 [["id", 1]]
=> [
[0] #<Account:0x000001099788d0> {
:id => 1,
:name => "New account",
:created_at => Mon, 29 Sep 2014 15:07:07 UTC +00:00,
:updated_at => Mon, 29 Sep 2014 15:07:07 UTC +00:00
}
]
There is probably some optimizations that could be done to this when dealing with large quantities of records, but it's a good base.
The other approach might be to do lookups on the association table itself, like:
def find_dim_user
DimUser.find_by(dwidUser: self.dwidUser)
end
But I very much prefer the first way I suggested, as it lets you do your normal ruby method chaining approach to associations.
Any other questions, let me know!
Edit: You could change the map function to use Active Record Relations or similar as well, enabling more functionality:
class Account < ActiveRecord::Base
has_many :account_dim_users
def dim_users
dim_user_ids = account_dim_users.map {|account_dim_user| account_dim_user.dwidUser }
DimUser.where(dwidUser: dim_user_ids)
end
end