
Comparing two OntModels in Jena


I am trying to compare two different versions of an ontology using Java. Both OntModels are populated from "RDF/XML-ABBREV". The results are not what I expected: when I check whether the two models are isomorphic, the result is true, but when I iterate through the models, the output says they are different. Even if I load the same version into both models, I get the same result. I want to check if any class is deleted or added in the new version. How is it possible to check for the changes?

        // Report whether the two models are isomorphic
        if (model.isIsomorphicWith(model2))
            System.out.println("Both are isomorphic");
        StmtIterator iter = model.listStatements();
        while (iter.hasNext()) {
            // Advance the iterator only once per pass; calling nextStatement()
            // twice would skip every other statement
            Statement stmt = iter.nextStatement();
            Resource toSearch = stmt.getSubject();
            if (!model2.containsResource(toSearch))
                System.out.println("statement not found: " + toSearch);
        }

Solution

  • In comparing RDF models, "isomorphic" means that the two models can be mapped 1:1 onto each other. This mapping uses equality for comparing any IRIs or literal values, but uses a correspondence mapping between blank nodes. This is necessary because, by definition, a blank node is unique to the model in which it occurs, so simply comparing blank nodes for equality will always fail.

    Two isomorphic models are, for all intents and purposes, identical. The reason your separate comparison (model2.containsResource) fails is that some resources in your model are blank nodes, so you are testing blank nodes for equality across models, which will always fail.

    "I want to check if any class is deleted or added in the new version. How is it possible to check for the changes?"

    The isIsomorphicWith check will tell you whether you need to look for changes at all. If the two models are not isomorphic, you should be able to find subjects that are missing from one model or the other, disregarding blank nodes:

          Resource toSearch = iter.nextStatement().getSubject();
          if (!toSearch.isAnon() && !model2.containsResource(toSearch)) { 
                  ...
          }
    

    Note that this check isn't foolproof, as you are merely checking for the addition or deletion of a subject. Your change might well have added a triple with an already-existing subject, and this check wouldn't find that change. But I'm assuming you only want to test for newly added or removed classes, in which case this should probably work. A more robust test might be to loop over all rdf:type relations and check that the ones not involving blank nodes exist in both models.
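The rdf:type comparison suggested above can be sketched roughly as follows. This is a library-independent illustration, not Jena code: the `Triple` class and the `TypeDiff` helper are hypothetical stand-ins, and a `null` subject or object is assumed here to represent a blank node. With Jena, the triples would presumably come from `model.listStatements(null, RDF.type, (RDFNode) null)`, with `isAnon()` used to skip blank nodes.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TypeDiff {
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String OWL_CLASS = "http://www.w3.org/2002/07/owl#Class";

    // Hypothetical stand-in for a statement; null marks a blank node (assumption).
    static final class Triple {
        final String subject, predicate, object;
        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Collect (subject, type) pairs of all rdf:type triples without blank nodes.
    static Set<List<String>> namedTypes(Collection<Triple> model) {
        Set<List<String>> pairs = new HashSet<>();
        for (Triple t : model) {
            if (RDF_TYPE.equals(t.predicate) && t.subject != null && t.object != null) {
                pairs.add(Arrays.asList(t.subject, t.object));
            }
        }
        return pairs;
    }

    // Type assertions present in 'a' but absent from 'b'.
    static Set<List<String>> missingFrom(Collection<Triple> a, Collection<Triple> b) {
        Set<List<String>> diff = new HashSet<>(namedTypes(a));
        diff.removeAll(namedTypes(b));
        return diff;
    }

    public static void main(String[] args) {
        List<Triple> oldVersion = Arrays.asList(
            new Triple("http://example.org/A", RDF_TYPE, OWL_CLASS),
            new Triple("http://example.org/B", RDF_TYPE, OWL_CLASS),
            new Triple(null, RDF_TYPE, OWL_CLASS)); // blank node, ignored
        List<Triple> newVersion = Arrays.asList(
            new Triple("http://example.org/A", RDF_TYPE, OWL_CLASS),
            new Triple("http://example.org/C", RDF_TYPE, OWL_CLASS));

        System.out.println("Removed: " + missingFrom(oldVersion, newVersion));
        System.out.println("Added:   " + missingFrom(newVersion, oldVersion));
    }
}
```

Running the comparison in both directions separates removed classes from added ones. Because blank nodes are skipped, anonymous class expressions (such as restrictions) are not compared by this sketch.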