
Import an edge list graph using its elements as labels with JGraphT 1.4.0


I updated a project of mine from JGraphT 1.3.1 to 1.4.0 and noticed that a new org.jgrapht.nio package has been introduced for I/O. Since org.jgrapht.io is now deprecated, I decided to switch to the new package so that my work stays future-proof for a few years.

My problem is that, after replacing the deprecated classes, edge lists like

a b
b c
b d
a d

(where the blank character is set as the separator) are no longer imported with the alphabetic characters as labels but with indices assigned in order of occurrence, i.e. the above edge list becomes

0 1
1 2
1 3
0 3

You can reproduce this behaviour by taking the CSVImporterTest.java test class and replacing the nodes in one of its test methods with alphabetic characters. The test fails because, when the graph is created by the builder, the vertex supplier is the one returned by SupplierUtil.createStringSupplier(1), which generates consecutive numbers as strings instead of picking the vertices from the edge list.

The user guide on serialization hasn't been updated for 1.4.0 yet and doesn't include any example using the org.jgrapht.nio package, and I clearly missed how to restore the behaviour of org.jgrapht.io.CSVImporter. How am I supposed to actually read the nodes from the edge list instead of counting them? Do I have to add some extra processing to convert those indices back to alphabetic letters?

I even tried to write a lambda myself, since the builder's vertexSupplier(Supplier<T>) takes a Supplier<T> as its input, but I got stuck at () -> T t, where t is clearly undefined and would have to be taken from the file somehow.


Solution

  • The new I/O package was redesigned from scratch, and it does change the semantics of graph creation. This is the main reason the package name changed from org.jgrapht.io to org.jgrapht.nio.

    Over the last few years, vertex and edge creation in graphs has been reworked around graph vertex and edge suppliers. The new I/O importers follow this design and call Graph#addVertex() whenever a new vertex is required, which in turn uses the graph's vertex supplier to create the vertex.

    Unfortunately, this leads to the behavior you observed. The actual vertex identifiers from the input file are still accessible: they are treated as vertex attributes and are reported during import under the key "ID".

    See also "Import graph with 1.4.0", which is a very similar case and uses this functionality to create a second graph with the exact same identifiers.
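
    As a rough sketch of the "ID" attribute approach for 1.4.0, the example below registers a vertex attribute consumer and records which original identifier each generated vertex corresponds to. It assumes JGraphT 1.4.0 is on the classpath and that CSVImporter reports the identifier under the key "ID" as described above. The class and method names (IdAttributeSketch, originalIds) are made up for illustration, and the edge list format with a blank delimiter is taken from the question.

    ```java
    import java.io.StringReader;
    import java.util.HashMap;
    import java.util.Map;

    import org.jgrapht.Graph;
    import org.jgrapht.graph.DefaultEdge;
    import org.jgrapht.graph.SimpleGraph;
    import org.jgrapht.nio.csv.CSVFormat;
    import org.jgrapht.nio.csv.CSVImporter;
    import org.jgrapht.util.SupplierUtil;

    // Hypothetical helper class, not part of JGraphT.
    class IdAttributeSketch {

        /** Imports the edge list into g and returns a map: generated vertex -> identifier in the file. */
        static Map<String, String> originalIds(Graph<String, DefaultEdge> g, String edgeList) {
            Map<String, String> ids = new HashMap<>();
            CSVImporter<String, DefaultEdge> importer = new CSVImporter<>(CSVFormat.EDGE_LIST, ' ');
            // Assumption (from the answer): the file identifier arrives as attribute "ID".
            importer.addVertexAttributeConsumer((vertexAndKey, attribute) -> {
                if ("ID".equals(vertexAndKey.getSecond())) {
                    ids.put(vertexAndKey.getFirst(), attribute.getValue());
                }
            });
            importer.importGraph(g, new StringReader(edgeList));
            return ids;
        }

        public static void main(String[] args) {
            // Vertices are created by the supplier, so they are named "0", "1", ...
            Graph<String, DefaultEdge> g = new SimpleGraph<>(
                SupplierUtil.createStringSupplier(),
                SupplierUtil.createDefaultEdgeSupplier(),
                false);
            Map<String, String> ids = originalIds(g, "a b\nb c\nb d\na d\n");

            // Print every edge using the identifiers from the file.
            for (DefaultEdge e : g.edgeSet()) {
                System.out.println(ids.get(g.getEdgeSource(e)) + " " + ids.get(g.getEdgeTarget(e)));
            }
        }
    }
    ```

    The map can then be used to relabel the vertices, or to build a second graph with the original identifiers as in the linked question.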

    On the other hand, the old behavior makes sense. It is natural to expect that the import retains your vertex identifiers (at least for most importers). A fix already exists in the form of a new method, #setVertexFactory(Function). This method lets the user bypass supplier-based vertex creation by providing a custom vertex factory. The factory is responsible for creating a new graph vertex from the vertex identifier read from the input file.

    The fix will be available in the next release (probably 1.4.1) and is already available in the snapshot build (1.4.1-SNAPSHOT). See https://github.com/jgrapht/jgrapht#using-via-maven on how to use the snapshot build.

    In order to retain the old behavior you should build your importer like:

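    CSVImporter<String, DefaultEdge> importer = new CSVImporter<>();
    // Use the identifier read from the file as the vertex itself.
    importer.setVertexFactory(id -> id);
    // g is the target graph; input holds the text to import.
    importer.importGraph(g, new StringReader(input));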
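
To see why a Supplier cannot do what the question asks while a factory Function can, here is a small library-free sketch. It is not JGraphT code: VertexCreationSketch, importEdges and countingSupplier are hypothetical names that only mimic the two vertex creation strategies described above, using the edge list from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for an importer; it only illustrates the two strategies.
class VertexCreationSketch {

    /** Supplier that returns "0", "1", "2", ... and never sees the file contents. */
    static Supplier<String> countingSupplier() {
        int[] next = {0};
        return () -> String.valueOf(next[0]++);
    }

    /**
     * Reads whitespace-separated edge lines and returns the edges as "source target".
     * If vertexFactory is non-null, each vertex is built from its identifier;
     * otherwise vertexSupplier is asked for a fresh vertex.
     */
    static List<String> importEdges(
        List<String> lines, Function<String, String> vertexFactory, Supplier<String> vertexSupplier) {
        Map<String, String> vertexById = new LinkedHashMap<>();
        Function<String, String> create =
            id -> vertexFactory != null ? vertexFactory.apply(id) : vertexSupplier.get();
        List<String> edges = new ArrayList<>();
        for (String line : lines) {
            String[] ids = line.trim().split("\\s+");
            // Each identifier is turned into a vertex only the first time it is seen.
            String source = vertexById.computeIfAbsent(ids[0], create);
            String target = vertexById.computeIfAbsent(ids[1], create);
            edges.add(source + " " + target);
        }
        return edges;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b", "b c", "b d", "a d");
        // Supplier-based creation: prints [0 1, 1 2, 1 3, 0 3]
        System.out.println(importEdges(lines, null, countingSupplier()));
        // Factory-based creation with id -> id: prints [a b, b c, b d, a d]
        System.out.println(importEdges(lines, id -> id, null));
    }
}
```

The supplier has no parameter, so it has no way to receive the identifier from the file. The factory receives it as its argument, which is why id -> id is enough to keep the original labels.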