Search code examples
exceptionsesamerdf4j

RDF4J RIO UnsupportedRDFormatException when adding data to an HTTPRepository using a stand-alone application


I have an HTTPRepository initialised with a the URL to the repository. I use a RepositoryConnection to retrieve and add (weather) data to the repository. The data is retrieved from a web service, then transformed into RDF statements, and added to the repository. This is done periodically by a stand-alone application.

When I run this application within IntelliJ, everything works fine.

To run this application on a server I created a jar file (containing all dependencies). The application starts as expected and is able to retrieve data from the repository.

However, when the application tries to write data to the repository I get an UnsupportedRDFormatException:

org.eclipse.rdf4j.rio.UnsupportedRDFormatException: Did not recognise RDF format object BinaryRDF (mimeTypes=application/x-binary-rdf; ext=brf)
    at org.eclipse.rdf4j.rio.Rio.lambda$unsupportedFormat$0(Rio.java:568) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at java.util.Optional.orElseThrow(Optional.java:290) ~[na:1.8.0_111]
    at org.eclipse.rdf4j.rio.Rio.createWriter(Rio.java:134) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.eclipse.rdf4j.rio.Rio.write(Rio.java:371) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.eclipse.rdf4j.rio.Rio.write(Rio.java:324) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.eclipse.rdf4j.repository.http.HTTPRepositoryConnection.addModel(HTTPRepositoryConnection.java:588) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.eclipse.rdf4j.repository.http.HTTPRepositoryConnection.flushTransactionState(HTTPRepositoryConnection.java:662) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.eclipse.rdf4j.repository.http.HTTPRepositoryConnection.commit(HTTPRepositoryConnection.java:326) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.eclipse.rdf4j.repository.base.AbstractRepositoryConnection.conditionalCommit(AbstractRepositoryConnection.java:366) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at org.eclipse.rdf4j.repository.base.AbstractRepositoryConnection.add(AbstractRepositoryConnection.java:431) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at nl.wur.fbr.data.weather.WeatherApp.retrieveData(WeatherApp.java:122) ~[weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at nl.wur.fbr.data.weather.WeatherData$WeatherTask.run(WeatherData.java:105) [weatherData-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    at java.util.TimerThread.mainLoop(Timer.java:555) [na:1.8.0_111]
    at java.util.TimerThread.run(Timer.java:505) [na:1.8.0_111]

The source code in which the error occurs is:

    public void retrieveData(){
        logger.info("Retrieving data for weather for app: "+ID+" ");
        RepositoryConnection connection = null;
        ValueFactory vf = SimpleValueFactory.getInstance();
        try {
            connection = repository.getConnection();

            // Retrieving the locations from the repository (no problem here).
            List<Location> locations = this.retrieveLocations(connection);
            List<Statement> statements = new ArrayList<>();

            // Retrieving weather data from each location and transforming it to statements.
            for(Location location : locations){
                List<Weather> retrievedWeather = weatherService.retrieveWeatherData(location.name,location.latitude,location.longitude);
                for(Weather weather : retrievedWeather){
                    BNode phenomenon = vf.createBNode();
                    statements.add(vf.createStatement(location.ID,WEATHER.HAS_WEATHER,phenomenon,rdfStoreGraph));
                    statements.addAll(weather.getStatements(phenomenon,vf,rdfStoreGraph));
                    statements = this.correctOMIRIs(statements,vf);
                }
            }

            // Adding data retrieved from the weather API
            // This is where the exception happens.
            connection.add(statements,rdfStoreGraph);

        } catch (Exception e) {
            logger.error("Could not retrievedata for weather app: '"+ID+"' because no monitor locations could be found.",e);
        } finally {
            if(connection != null){
                connection.close();
            }
        }
    }

The HTTPRespository is initialised as so:

        repository = new HTTPRepository(rdfStore.toString());
        ((HTTPRepository)repository).setPreferredRDFFormat(RDFFormat.BINARY);
        ((HTTPRepository)repository).setPreferredTupleQueryResultFormat(TupleQueryResultFormat.BINARY);

I've tried changing the formats to TURTLE. But it makes no difference.

Can you tell me how to solve this?

NB. Both the RDF4J server and library have version 2.0.1 (rdf4j).


Solution

  • To run this application on a server I created a jar file (containing all dependencies).

    There's your problem: you created a "fat jar" and probably haven't properly merged SPI registry files.

    RDF4J's Rio parsers (and several other modules as well) use Java's Service Provider Interface (SPI) mechanism to register themselves. This mechanism relies on a text file in META-INF\services in the jar file containing the fully-qualified name of each parser implementation.

    The problem comes when you merge jars: each Rio parser jar has a registry file with the same name, but different contents. If you are using something like the maven assembly plugin to create the fat jar, each registry file gets overwritten by the next one. The consequence is that at the end, RDF4J can only find one parser - the one whose registry file was added last to the fat jar.

    The solution is to either not create a fat jar at all, or if you must, use a different technique to create it, which merges the registry files rather than overwriting them. The maven shade plugin has a good config option for this: the ServicesResourceTransformer.