We are using Azure and Phoenix to write into an HBase cluster. There are two drivers: a thick client and a thin client. The thin client uses JSON over HTTP to interact with the database.
When we create a new connection for each save, each save takes around 150 ms:
this.conn = DriverManager.getConnection("jdbc:phoenix:thin:url=http://1.1.1.1");
When we reuse the connection, a save takes about 70 ms: a significant speedup. The documentation is a bit vague and shifts between the thick and thin client.
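For context, the two access patterns we timed look roughly like this; MY_TABLE and its columns are placeholders for our actual schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class SaveTimingSketch {

    // Pattern 1: open a fresh thin-client connection for every save
    // (this is what cost us ~150 ms per save).
    static void saveWithNewConnection(long id, String value) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:thin:url=http://1.1.1.1");
             PreparedStatement ps = conn.prepareStatement("UPSERT INTO MY_TABLE VALUES (?, ?)")) {
            ps.setLong(1, id);
            ps.setString(2, value);
            ps.executeUpdate();
            conn.commit();
        }
    }

    // Pattern 2: reuse one long-lived connection across saves
    // (~70 ms per save for us).
    static void saveWithSharedConnection(Connection conn, long id, String value) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement("UPSERT INTO MY_TABLE VALUES (?, ?)")) {
            ps.setLong(1, id);
            ps.setString(2, value);
            ps.executeUpdate();
            conn.commit();
        }
    }
}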
So what is the best practice for pooling the connections of the thin client?
Important Change!
We started having some trouble with connections, so I went back to our code and made some changes. I set some timers and found that the connection code above now completes in roughly 0 ms. I am not sure what I was doing wrong before.
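The timing was nothing fancy, roughly a bare stopwatch around getConnection:

import java.sql.Connection;
import java.sql.DriverManager;

public class ConnectionTimer {
    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        Connection conn = DriverManager.getConnection("jdbc:phoenix:thin:url=http://1.1.1.1");
        // for us this now reports roughly 0 ms
        System.out.println("getConnection took " + (System.currentTimeMillis() - start) + " ms");
        conn.close();
    }
}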
So the correct way to pool Phoenix connections is to NOT pool them. There are several posts from the dev team confirming this.
Having a deep SQL background (Oracle/DB2/SQL Server) is probably my undoing. Using Redis or Phoenix, or probably any of the newer NoSQL databases, is very different from traditional SQL. My suggestion: read the directions for the product you are using and do what they tell you to do!
According to the FAQ on the official Phoenix site, it is not necessary to pool Phoenix JDBC connections:
Phoenix’s Connection objects are different from most other JDBC Connections due to the underlying HBase connection. The Phoenix Connection object is designed to be a thin object that is inexpensive to create. If Phoenix Connections are reused, it is possible that the underlying HBase connection is not always left in a healthy state by the previous user. It is better to create new Phoenix Connections to ensure that you avoid any potential issues.
Implementing pooling for Phoenix could be done simply by creating a delegate Connection that instantiates a new Phoenix connection when retrieved from the pool and then closes the connection when returning it to the pool (see PHOENIX-2388).
I think the guidance quoted above is the best practice for pooling thin-client connections.
You can also tune client-side configuration for performance; see Configuration and Tuning for the details.
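As a sketch of what that tuning can look like, properties can be passed to the driver at connect time. The property names below come from Phoenix's Configuration and Tuning page, but verify the names and defaults against the docs for your Phoenix version; with the thin client, some settings take effect on the Phoenix Query Server rather than in the application JVM, so treat this as illustrative:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class TunedConnectionExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("phoenix.query.timeoutMs", "60000");    // per-query timeout
        props.setProperty("phoenix.query.threadPoolSize", "128"); // client executor threads
        props.setProperty("phoenix.mutate.batchSize", "1000");    // rows batched per commit

        Connection conn = DriverManager.getConnection(
                "jdbc:phoenix:thin:url=http://1.1.1.1", props);
        conn.close();
    }
}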
Update: my simple implementation of a Phoenix connection pool is below.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class DataSource {

    private final String driverClass;
    private final String jdbcUrl;
    private final String user;
    private final String password;
    private final int min; // number of connections created up front
    private final int max; // cap on connections handed out and not yet released
    // count of connections polled and not yet released; AtomicInteger so
    // concurrent callers cannot corrupt the counter
    private final AtomicInteger used = new AtomicInteger(0);
    private final Queue<Connection> pool = new ConcurrentLinkedQueue<Connection>();

    public DataSource(String driverClass, String jdbcUrl, String user, String password, int min, int max) {
        this.driverClass = driverClass;
        this.jdbcUrl = jdbcUrl;
        this.user = user;
        this.password = password;
        this.min = min;
        this.max = max;
        init();
    }

    // Pre-create the minimum number of connections so the first callers
    // never pay the connection-creation cost.
    private void init() {
        try {
            Class.forName(driverClass);
            for (int i = 0; i < min; i++) {
                pool.add(DriverManager.getConnection(jdbcUrl, user, password));
            }
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    // Hand out a pre-created connection when one is available and kick off a
    // background thread to create its replacement, so the pool stays warm.
    // If the pool is empty and we are under the cap, create one synchronously.
    public Connection getConnection() {
        Connection conn = pool.poll(); // returns null when the pool is empty
        if (conn != null) {
            used.incrementAndGet();
            Thread connGen = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        pool.add(DriverManager.getConnection(jdbcUrl, user, password));
                    } catch (SQLException e) {
                        e.printStackTrace();
                    }
                }
            });
            connGen.start();
        } else if (used.get() < max) {
            try {
                conn = DriverManager.getConnection(jdbcUrl, user, password);
                used.incrementAndGet(); // count this connection as in use
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
        return conn;
    }

    // Phoenix connections are cheap, so "returning" one simply closes it;
    // getConnection() has already queued a fresh replacement for the pool.
    public void releaseConnection(Connection conn) {
        if (conn != null) {
            try {
                conn.close();
                used.decrementAndGet();
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }
}
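A quick usage sketch, assuming the Phoenix 4.x thin-client driver class and placeholder URL, credentials, table, and pool sizes:

import java.sql.Connection;
import java.sql.PreparedStatement;

public class DataSourceDemo {
    public static void main(String[] args) throws Exception {
        DataSource ds = new DataSource(
                "org.apache.phoenix.queryserver.client.Driver", // thin-client driver class
                "jdbc:phoenix:thin:url=http://1.1.1.1",
                "user", "password",
                5, 20); // keep 5 connections warm, allow at most 20 in use

        Connection conn = ds.getConnection();
        try (PreparedStatement ps = conn.prepareStatement("UPSERT INTO MY_TABLE VALUES (?, ?)")) {
            ps.setLong(1, 1L);
            ps.setString(2, "hello");
            ps.executeUpdate();
            conn.commit();
        } finally {
            ds.releaseConnection(conn); // closes it; a replacement is already queued
        }
    }
}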