I run several sql statements in a transaction using Ruby pg gem. The problem that I bumped in is that connection times out on these queries due to firewall setup. Solution proposed here does not work, because it requires jdbc connections string, and I'm in Ruby (jRuby is not an option). Moving driver program to AWS to remove firewall is not an option either.
The code that I have is along the following lines:
conn = RedshiftHelper.get_redshift_connection
begin
conn.transaction do
# run my queries
end
ensure
conn.flush
conn.finish
end
I'm now looking into PG asynchronous API. I'm wondering if I can use is_busy to prevent firewall from timing out, or something to that effect. I can't find good documentation on the topic though. Appreciate any hints on that.
PS: I have solved this problem for a single query - I can trigger it asynchronously and track its completion using system STV_INFLIGHT Redshift table.Transaction does not work this way as I have to keep connection open.
Ok, I nailed it down. Here are the facts:
Based on the facts above, tcp keepalive is not supported by Redshift. However, PG allows you to retrieve a socket that is used in the established connection. This means that even though libpq does not set keepalive feature, we still can use it manually. The solution thus:
class Connection
attr_accessor :socket, :pg_connection
def initialize(conn, socket)
@socket = socket
@pg_connection = conn
end
def method_missing(m, *args, &block)
@pg_connection.send(m, *args, &block)
end
def close
@socket.close
@pg_connection.close
end
def finish
@socket.close
@pg_connection.close
end
end
def get_connection
conn = PGconn.open(...)
socket_descriptor = conn.socket
socket = Socket.for_fd(socket_descriptor)
# Use TCP keep-alive feature
socket.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, 1)
# Maximum keep-alive probes before asuming the connection is lost
socket.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPCNT, 5)
# Interval (in seconds) between keep-alive probes
socket.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPINTVL, 2)
# Maximum idle time (in seconds) before start sending keep-alive probes
socket.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPIDLE, 2)
socket.autoclose = true
return Connection.new(conn, socket)
end
The reason why I introduce a proxy Connection
class is because of Ruby tendency to garbage-collect IO
objects (like sockets) when they get out of scope. This means that we now need connection and socket to be in the same scope, which is achieved through this proxy class. My Ruby knowledge is not deep, so there may be a better way to handle the socket object.
This approach works, but I would be happy to learn if there are better/cleaner solutions.