Search code examples
javacollectionsstack-overflow

Is StackOverflowError at a collection's toString() method a bug?


I have written this piece of code to demonstrate:

List<Object> list1 = new ArrayList<>();
List<Object> list2 = new ArrayList<>();
list2.add(list1);
list1.add(list2);
list1.toString();

This code will cause the StackOverflowError.

However I am aware there is some effort in java's collections to prevent this, for example this code will work fine:

List<Object> list1 = new ArrayList<>();
list1.add(list1);
list1.toString();

Some other languages seem to handle it as well (both cases). Is there a reason the first example is not "sanitized" and the second one is? Is it a bug?


Solution

  • It's not exactly a bug - it's more an unfortunate pragmatism. toString should try its hardest not to throw anything (it's a debugging tool, and that'd be quite annoying - there is no real need to report on extraneous circumstances during a toString calculation) - but there are limits to how hard it should try, because toString should also be performant, and easy to write.

    The problem is, java is an OO language, and things are encapsulated. There is no way to know that invoking a toString on a 'component' object is going to end up calling toString back on yourself again just by looking at what the component is. It's just an object after all (lists can hold anything - including other lists, sets, maps, etcetera). Java, unlike some other languages, also has fully extensible core datatypes: You can write your own implementation of list, and many do. (java.util.concurrent has a bunch of very useful implementations of the core collection APIs, for example). Thus, trying to cater to this situation with some sort of internal communication system between collections would put the onus of managing all that on those extending, as well, which is not the pragmatic choice.

    There are some very tricky ways, such as trying to catch StackOverflow, or setting a flag in a ThreadLocal and returning an alternate value if this flag is true (because that implies calling the toString on some component object eventually ended up calling toString on you again), but now you have to spec that toString calculations cannot farm out the work to other threads. It's highly unlikely any would, but adding a whole bunch of bizarre caveats to the toString implementation, or making it into a little framework to communicate the concept of recursive componenting, is all doing exactly what is totally inappropriate for a debugging tool: Make it tightly bound, and complicated.