Search code examples
javajava-14

Are Java Records suitable to represent the nodes of a graph?


One of the classic ways to represent a graph (as in Graph Theory) is:

class Node {
  String value;
  List<Node> children;

  // constructor, equals, etc. are omitted
}

The question is with respect to using the new Records feature introduced in Java 14

Specifically, are there any potential pitfalls with using a declaration like below for an algorithm like DFS:

record Node(String value, List<Node> children) {}

One potential problem is with respect to the equals / hashCode method provided by records. I imagine that, in the case of Node, the implementations provided by record could result in infinite recursion (StackOverflow).

For example, consider the code below that uses a Set collection.

import java.util.*;

class SO {
    
  static void dfs(Node n, Set<Node> visited) {
    if (n == null) return;
    System.out.println("visited " + n.value());
    for (Node child : n.children()) {
      if (visited.contains(child)) continue;
      visited.add(child);
      dfs(child, visited);
    }
  }

  public static void main(String[] args) {
    var a = new Node("a", new ArrayList<>());
    var b = new Node("b", new ArrayList<>());
    a.children().add(b);
    b.children().add(a);
    dfs(a, new HashSet<>());
  }

}

record Node(String value, List<Node> children) {}

This code results in a StackOverflow:

% java SO.java
visited a
Exception in thread "main" java.lang.StackOverflowError
    at Node.hashCode(SO.java:25)
    at java.base/java.util.ArrayList.hashCodeRange(ArrayList.java:595)
    at java.base/java.util.ArrayList.hashCode(ArrayList.java:582)
    at java.base/java.util.Objects.hashCode(Objects.java:103)
    at Node.hashCode(SO.java:25)
    ... <omitted> ...

Does this mean the use of record is generally unsuitable for graph algorithms ? If not, how to correctly implement such algorithms when used with record ?


Solution

  • The stack trace reveals the cause of the stack overflow error.

    The problem is the circular relationship you've created - a is a child of b and b is a child of a - causes an infinite loop when calculating the hash code of the node.

    A record calculates its hash code using all fields. In turn, a List calculates its hash code using the hash code of all of its elements. Because a's list includes b, and b's list includes a, the calculation of hash code, which happens for every object passed to a HashSet method, descends infinitely.

    The fix is to override equals() and hashCode() to not involve children, probably using Object implementation via return super.hashCode(); etc

    Or don't define circular relationships.