connection-pooling jena

How can I raise the number of connections per route in Jena above 5?


I'd like to know how I can change the number of connections per route in Jena. When I try to make more than 5 queries, the thread locks up. I tried modifying some parameters on QueryEngineHTTP, but it doesn't work (I tried qexec.addParam("max connections",10) and other variations too). The project is about getting every node in a thesaurus (nodes that are connected by the property narrower), so I used a recursive function, and after 5 recursive cycles I'm locked. Thank you so much!

    Query query = QueryFactory.create(queryString);
    if (!occurs.contains(entity)) {
        // print depth times "\t" to retrieve an explorer tree like output
        for (int i = 0; i < depth; i++) {
            System.out.print("\t");
        }
        // print out the URI
        System.out.println(entity); 

        try (QueryExecution qexec = QueryExecutionFactory.sparqlService("http://data.bnf.fr/sparql", query)) {

            QueryEngineHTTP objectToExec = (QueryEngineHTTP) QueryExecutionFactory.sparqlService("http://data.bnf.fr/sparql", query);
            // objectToExec.addParam("timeout","500000"); //5 sec
            objectToExec.addParam("http.conn-manager.timeout", "10");
            ResultSet results = qexec.execSelect();

            while (results.hasNext()) {
                QuerySolution soln = results.nextSolution();
                RDFNode sub = soln.get("pL");
                // System.out.println("sub "+sub.toString());
                if (!sub.isURIResource()) continue;
                // push this expression on the occurs list before we recurse to avoid loops
                occurs.add(entity);
                // traverse down and increase depth (used for logging tabs)
                traverse(sub.toString(), occurs, depth + 1);
                // after traversing the path, remove from occurs list
                occurs.remove(entity);
            }
        }
    }

Solution

  • The default HttpClient setup used by Apache Jena allows a maximum of 5 connections per route (see the code).

    You need to configure the default HttpClient instance by creating your desired HttpClient and then using the HttpOp.setDefaultHttpClient() method to set it as the default client.

    Refer to the HttpClient Connection Management documentation for how to configure a client appropriately. You can use the aforementioned code link as a basis and modify it accordingly. For example, to allow a maximum of 20 connections per route and 100 in total:

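    import org.apache.http.client.HttpClient;
    import org.apache.http.impl.client.HttpClientBuilder;
    import org.apache.http.impl.client.LaxRedirectStrategy;
    import org.apache.jena.riot.web.HttpOp;

    // Build a client with larger pool limits; build() is required to get an HttpClient
    HttpClient client =
        HttpClientBuilder.create()
            .useSystemProperties()
            .setRedirectStrategy(new LaxRedirectStrategy())
            .setMaxConnPerRoute(20)
            .setMaxConnTotal(100)
            .build();
    HttpOp.setDefaultHttpClient(client);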
    

    Please note that you also need to make sure you are freeing your connections properly. I see you use try-with-resources on your QueryExecution, which should automatically call close() on the execution and free the underlying connection. But you never call close() on the ResultSet, which might also be holding a reference to the connection. It never hurts to call close() explicitly when you are done with a result set or a query execution.

    Since you are making a recursive call, you may want to call close() before recursing; otherwise you will still run into this eventually if your hierarchies are very deep. Because you potentially recurse on every result, you may not be able to close it immediately, so it may be useful to grab a copy of the result set with ResultSetFactory.copyResults(). That lets you close the execution before looping over the results.
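
    To illustrate why copying before recursing helps, here is a minimal, library-free sketch. It is a simulation, not Jena code: a Semaphore with 5 permits stands in for the connection pool, the class CopyThenRecurse, the NARROWER map and the node names n0..n8 are all hypothetical, and the list copy only plays the role that ResultSetFactory.copyResults() would play in the real traversal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Semaphore;

class CopyThenRecurse {
    // Stand-in for the HTTP pool: 5 permits, mirroring the per-route limit discussed above.
    static final int POOL_SIZE = 5;
    static final Semaphore POOL = new Semaphore(POOL_SIZE);
    static int peakInUse = 0;

    // Hypothetical thesaurus: each node maps to its narrower nodes.
    static final Map<String, List<String>> NARROWER = new HashMap<>();

    // Simulates one query: lease a connection, copy the rows, release the connection.
    static List<String> fetchNarrower(String entity) {
        if (!POOL.tryAcquire()) {
            // Fail loudly instead of blocking forever, so exhaustion is visible.
            throw new IllegalStateException("pool exhausted at " + entity);
        }
        try {
            peakInUse = Math.max(peakInUse, POOL_SIZE - POOL.availablePermits());
            // The copy plays the role of ResultSetFactory.copyResults() (assumed analogy).
            return new ArrayList<>(NARROWER.getOrDefault(entity, Collections.<String>emptyList()));
        } finally {
            // Plays the role of closing the query execution before recursing.
            POOL.release();
        }
    }

    static void traverse(String entity, List<String> visited) {
        if (visited.contains(entity)) return; // avoid loops
        visited.add(entity);
        // The connection is already released here, so recursion depth no longer matters.
        for (String child : fetchNarrower(entity)) {
            traverse(child, visited);
        }
    }

    public static void main(String[] args) {
        // A chain n0 -> n1 -> ... -> n8, deeper than the 5 available permits.
        for (int i = 0; i < 8; i++) {
            NARROWER.put("n" + i, Collections.singletonList("n" + (i + 1)));
        }
        List<String> visited = new ArrayList<>();
        traverse("n0", visited);
        System.out.println(visited.size() + " nodes, peak connections " + peakInUse);
    }
}
```

    Running main prints "9 nodes, peak connections 1": the chain is deeper than the pool, yet only one simulated connection is ever held. If the permit were instead held while looping over the children, as happens when recursing inside an open query execution, the sixth nested call would find the pool empty.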