Search code examples
ruby-on-railsrubymultithreadingconnection-poolingdatabase-concurrency

Close Rails ActiveRecord Connection Pool


I am using a second database with datasets within my API.

Every API request can have up to 3 queries on that Database so I am splitting them in three Threads. To keep it Thread safe I am using a connection pool.

But after the whole code is run the ConnectionPool thread is not terminated. So basically every time a request is made, we will have a new Thread on the server until basically there is no memory left.

Is there a way to close the connection pool thread? Or am I doing wrong on creating a connection pool per request?

I setup the Connection Pool this way:

begin
  full_db = YAML::load(ERB.new(File.read(Rails.root.join("config","full_datasets_database.yml"))).result)
  resolver = ActiveRecord::ConnectionAdapters::ConnectionSpecification::Resolver.new(full_db)
  spec = resolver.spec(Rails.env.to_sym)
  pool = ActiveRecord::ConnectionAdapters::ConnectionPool.new(spec)

Then I am running through the queries array and getting the results to the query

returned_responses = []
queries_array.each do |query|
  threads << Thread.new do
    pool.with_connection do |conn|
     returned_responses << conn.execute(query).to_a
    end
  end
end

threads.map(&:join)

returned_responses

Finally I close the connections inside the connection pool:

ensure
 pool.disconnect!
end

Solution

  • After some time spent, I ended up finding an answer. The generic idea came from @anothermg but I had to do some changes in order to work in my version of rails (5.2).

    I setup the database in config/full_datasets_database.yml I had the following initializer already:

    #! config/initializers/db_full_datasets.rb
    DB_FULL_DATASETS = YAML::load(ERB.new(File.read(Rails.root.join("config","full_datasets_database.yml"))).result)[Rails.env]
    

    I created the following model to create a connection to the new database:

    #! app/models/full_datasets.rb
    class FullDatasets < ActiveRecord::Base
      self.abstract_class = true
    
      establish_connection DB_FULL_DATASETS
    end
    

    On the actual module I added the following code:

      def parallel_queries(queries_array)
        returned_responses = []
        threads = []
    
        conn = FullDatasets.connection_pool
    
        queries_array.each do |query|
          threads << Thread.new do
            returned_responses << conn.with_connection { |c| c.execute(query).to_a }
          end
        end
    
        threads.map(&:join)
    
        returned_responses
      end