I run into a problem while querying a Sesame triple store using SPARQL. After I've run several queries successfully, the connection to the triple store blocks. I've been able to pinpoint the problem to AbstractConnPool.getPoolEntryBlocking
line 306: success = future.await(deadline);
of the apache-httpcomponents
library. If I understand correctly, this method blocks when the maximum number of connections is exceeded. The maximum is 5 connections, and indeed the number of open connections in the pool at that point is 5.
What I do not understand is why there are 5 open connections at that point.
The problem happens when I call the evaluate
method on a TupleQuery
. Every time I open a new connection with
connection = repository.getConnection();
I also close it:
} finally {
    if (connection != null) {
        try {
            connection.close();
            nclosedconnections++;
            System.out.println("Connections: " + nconnections + " closed: " + nclosedconnections);
        } catch (RepositoryException e) {
            throw new SearchException("Could not close the triple store as a search engine.", this, null, e);
        }
    }
}
I've checked how often a RepositoryConnection
is opened and how often it is closed. When the method blocks, a RepositoryConnection
has been opened 6 times and has been closed 5 times, as expected.
Each connection is also used only once (i.e. for one SPARQL query). I've also tried reusing the connection but then I still get the same block.
Do you have any idea why this goes wrong, and how I can solve this problem?
NB. The Sesame repository runs on Tomcat and the connection is made via HTTP, i.e. the repository is an HTTPRepository
and is created by:
repository = new HTTPRepository(repositoryURL);
repository.initialize();
I've also checked the Sesame logs on the server, but no request is received by the Sesame server. The problem seems to be on the client side, where no request is sent.
NB2. Below is a more complete code snippet:
RepositoryConnection connection = null;
String sparql = "" +
        "SELECT * WHERE {\n" +
        " OPTIONAL{ <" + result.getURI() + "> <" + DC.TITLE + "> ?title. }" +
        " OPTIONAL{ <" + result.getURI() + "> <" + RDFS.LABEL + "> ?label. }" +
        " OPTIONAL{ <" + result.getURI() + "> <" + SKOS.PREF_LABEL + "> ?prefLabel. }" +
        " OPTIONAL{ <" + result.getURI() + "> <" + SKOS.ALT_LABEL + "> ?altLabel. }" +
        " OPTIONAL{ <" + result.getURI() + "> <" + DC.DESCRIPTION + "> ?description. }" +
        " OPTIONAL{ <" + result.getURI() + "> <" + RDFS.COMMENT + "> ?comment. }" +
        "}\n";
try {
    connection = repository.getConnection();
    nconnections++;
    System.out.println("Connections: " + nconnections + " closed: " + nclosedconnections);
    TupleQuery query = connection.prepareTupleQuery(QueryLanguage.SPARQL, sparql);
    query.setMaxExecutionTime(2);
    TupleQueryResult results = query.evaluate();
    while (results.hasNext()) {
        ...
    }
} catch (RepositoryException e) {
    throw new SearchException("Could not access the triple store as a search engine.", this, null, e);
} catch (QueryEvaluationException e) {
    throw new SearchException("Could not retrieve data from the triple store as the SPARQL query could not be evaluated. SPARQL:\n" + sparql, this, null, e);
} catch (MalformedQueryException e) {
    throw new SearchException("Could not retrieve data from the triple store as the SPARQL query was malformed. SPARQL:\n" + sparql, this, null, e);
} finally {
    if (connection != null) {
        try {
            connection.close();
            nclosedconnections++;
            System.out.println("Connections: " + nconnections + " closed: " + nclosedconnections);
        } catch (RepositoryException e) {
            throw new SearchException("Could not close the triple store as a search engine.", this, null, e);
        }
    }
}
The reason this happens is that you do not invoke results.close()
on the TupleQueryResult
after you are done with it.
The Sesame API mandates that you explicitly invoke close()
on query results and iterations after you are done with them. To quote from the programmers' manual:
[...] it is important to invoke the close() operation on the TupleQueryResult, after we are done with it. A TupleQueryResult evaluates lazily and keeps resources (such as connections to the underlying database) open. Closing the TupleQueryResult frees up these resources. Do not forget that iterating over a result may cause exceptions! The best way to make sure no connections are kept open unnecessarily is to invoke close() in the finally clause.
The recommended pattern is to use a try-finally
block:
TupleQueryResult result = tupleQuery.evaluate();
try {
    while (result.hasNext()) {
        // process result items
    }
}
finally {
    result.close();
}
The reason you didn't have this problem when using an older version of Sesame, by the way, is that there was an undocumented feature which automatically closed a query result when it was fully exhausted. In release 2.8, processing of query results over HTTP was completely reimplemented, and this undocumented feature was not part of it. So, while not strictly speaking a bug (the 'official' way has always been that you need to close it yourself), it is a regression from in-practice behavior. I have logged this as an issue (see SES-2323), and it will be fixed in the next patch release.
By the way, there are several ways to make query processing a little easier, especially if you do not particularly need to do streaming processing of the result. For example, you can do something like this:
List<BindingSet> results = QueryResults.asList(query.evaluate());
which pulls the entire query result into a simple List
, and automatically closes the underlying QueryResult
for you.
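To see why the result is closed automatically: asList essentially drains the iteration and closes it in a finally block. Below is a plain-Java sketch of that pattern; SimpleResult and ToyResult are made-up stand-ins for illustration, not Sesame's actual types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListSketch {

    // Simplified stand-in for Sesame's TupleQueryResult; NOT the real interface.
    interface SimpleResult<T> extends AutoCloseable {
        boolean hasNext();
        T next();
        @Override void close(); // narrowed: no checked exception for the demo
    }

    // Drain the result into a List and close it no matter what happens
    // during iteration -- the same guarantee QueryResults.asList provides.
    static <T> List<T> asList(SimpleResult<T> result) {
        try {
            List<T> list = new ArrayList<>();
            while (result.hasNext()) {
                list.add(result.next());
            }
            return list;
        } finally {
            result.close();
        }
    }

    // Toy result over three rows that records whether close() was called.
    static class ToyResult implements SimpleResult<String> {
        private final List<String> rows = Arrays.asList("a", "b", "c");
        private int i = 0;
        boolean closed = false;
        public boolean hasNext() { return i < rows.size(); }
        public String next() { return rows.get(i++); }
        public void close() { closed = true; }
    }

    public static void main(String[] args) {
        ToyResult result = new ToyResult();
        System.out.println(asList(result));            // [a, b, c]
        System.out.println("closed: " + result.closed); // closed: true
    }
}
```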
One more thing: in the upcoming Sesame 4.0 release (currently available as 4.0.0-RC1), a lot of this is made far more elegant through new Java 7/8 features such as AutoCloseable
and lambda expressions.
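As a plain-Java illustration of what AutoCloseable buys you (StubResult here is a made-up resource, not the Sesame 4.0 API): try-with-resources guarantees close() is invoked on both normal exit and when iteration throws, so a forgotten finally block can no longer leak a connection.

```java
public class TryWithResourcesSketch {

    // Stub resource standing in for a query result that implements AutoCloseable.
    static class StubResult implements AutoCloseable {
        static int closeCount = 0;
        @Override public void close() { closeCount++; }
    }

    public static void main(String[] args) {
        // Closed automatically on normal exit...
        try (StubResult result = new StubResult()) {
            // iterate over result
        }

        // ...and also closed when iteration throws.
        try {
            try (StubResult result = new StubResult()) {
                throw new RuntimeException("boom");
            }
        } catch (RuntimeException expected) {
            // swallowed for the demo
        }

        System.out.println("close() calls: " + StubResult.closeCount); // close() calls: 2
    }
}
```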