Tags: java, optimization, recursion, compiler-construction, jit

Can the Java compiler optimize adding to a set in recursive methods?


A simple question, asked mostly out of curiosity about what Java compilers are smart enough to do. I know not all compilers are built equally, but I'm wondering whether it's reasonable to expect this optimization from most compilers I'm likely to run against, not whether it works on a specific version or on all versions.

So let's say that I have some tree structure and I want to collect all the descendants of a node. There are two easy ways to do this recursively.

The more natural way, for me, would be something like this:

public Set<Node> getDescendants(){

   Set<Node> descendants = new HashSet<Node>();
   descendants.addAll(getChildren());

   for(Node child : getChildren()){
      descendants.addAll(child.getDescendants());
   }

   return descendants;
}

However, assuming no compiler optimizations and a decent sized tree, this could get rather expensive. On each recursive call I create and fully populate a set, only to return that set up the stack so the calling method can add the contents of my returned set to its own version of the descendants set, discarding the version that was just built and populated in the recursive call.

So now I'm creating many sets just to have them be discarded as soon as I return their contents. Not only do I pay a minor initialization cost for building the sets, but I also pay the more substantial cost of moving all the contents of one set into the larger set. In large trees, most of my time is spent moving Nodes around in memory from set A to set B. I think this even makes my algorithm O(n^2) instead of O(n) due to the time spent copying Nodes, though it may work out to O(n log n) if I sit down and do the math.
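
Doing that math roughly (my own back-of-the-envelope, assuming n nodes): each node's entry gets copied into a fresh set once per ancestor level, so the total copy work is about n * depth. For a balanced tree (depth ≈ log n) that gives O(n log n); for a degenerate, list-shaped tree (depth ≈ n) it really is O(n^2).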

I could instead have a simple getDescendants method that calls a helper method that looks like this:

public Set<Node> getDescendants(){
    Set<Node> descendants = new HashSet<Node>();
    getDescendantsHelper(descendants);   

    return descendants;
}

public Set<Node> getDescendantsHelper(Set<Node> descendants){

   descendants.addAll(getChildren());

   for(Node child : getChildren()){
      child.getDescendantsHelper(descendants);
   }

   return descendants;
}

This ensures that I only ever create one set and I don't have to waste time copying from one set to another. However, it requires writing two methods instead of one and generally feels a little more cumbersome.
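
For completeness, a third option (just a sketch of mine, not one of the two above; it assumes getChildren() never returns null and that the structure is a true tree with no cycles) would be to drop the recursion and use an explicit worklist, which also builds a single set but needs only one method:

public Set<Node> getDescendants(){
   Set<Node> descendants = new HashSet<Node>();
   // java.util.ArrayDeque as a worklist; assumes no cycles and non-null getChildren()
   Deque<Node> toVisit = new ArrayDeque<Node>(getChildren());

   while(!toVisit.isEmpty()){
      Node current = toVisit.pop();
      descendants.add(current);
      toVisit.addAll(current.getChildren());
   }

   return descendants;
}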

The question is: do I need to do option two if I'm worried about optimizing this sort of method? Or can I reasonably expect the Java compiler, or the JIT, to recognize that I am only creating temporary sets for the convenience of returning them to the calling method, and to avoid the wasteful copying between sets?

Edit: cleaned up a bad copy-paste job which led to my sample method adding everything twice. You know something is bad when your 'optimized' code is slower than your regular code.


Solution

  • The question is, do I need to do option two if I'm worried about optimizing this sort of method?

    Definitely yes. If performance is a concern (and most of the time it is not!), then you need it.

    The compiler optimizes a lot, but on a very different scale. Basically, it works on one method at a time and optimizes the most commonly used paths therein. Due to heavy inlining it can sort of optimize across method calls, but nothing like the above.
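
    To illustrate that scale with a made-up example (Point, getX and sumX are hypothetical names of mine, not from the question): a trivial getter is a textbook inlining candidate, and once it is inlined the JIT can optimize the loop as if the field access were written there directly.

    // illustrative sketch only: hypothetical class and methods
    final class Point {
        private final int x;
        Point(int x) { this.x = x; }
        int getX() { return x; }     // trivial getter: a prime inlining candidate

        static int sumX(Point[] points) {
            int total = 0;
            for (Point p : points) {
                total += p.getX();   // after inlining, just a field load in the loop
            }
            return total;
        }
    }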

    It can also optimize away needless allocations, but only in very simple cases. Maybe something like

    int sum(int... a) {
        int result = 0;
        for (int x : a) result += x;  // the varargs array never escapes this method
        return result;
    }
    

    Calling sum(1, 2, 3) means allocating an int[3] for the varargs arguments, and this allocation can be eliminated (whether the compiler really does it is a different question). It could even find out that the result is a constant (which I doubt it really does). If the result doesn't get used, it can perform dead code elimination (this happens rather often).
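
    For concreteness, hypothetical call sites for the sum above (what a JIT could do in principle, not what any given version is guaranteed to do):

    int s = sum(1, 2, 3);    // the int[3] may be scalar-replaced after inlining;
                             // in principle the whole call folds to the constant 6
    sum(4, 5, 6);            // return value unused: a dead code elimination candidate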

    Your example involves allocating a whole HashSet and all its entries, and is several orders of magnitude more complicated. The compiler has no idea how a HashSet works and it can't find out, e.g., that after m.addAll(m1) the set m contains all members of m1. No way.
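
    To see why, consider what addAll boils down to (a simplified sketch of mine, along the lines of AbstractCollection.addAll): from the JIT's point of view it is just a loop of hashing and writes, and the invariant "m now contains all members of m1" appears nowhere in the code it sees.

    // simplified sketch: the set-containment semantics are invisible to the JIT
    static boolean addAll(Set<Node> m, Set<Node> m1) {
        boolean modified = false;
        for (Node e : m1) {
            if (m.add(e))    // each add: hash the element, probe buckets, maybe resize
                modified = true;
        }
        return modified;
    }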

    This is an algorithmic optimization rather than a low-level one. That's what humans are still needed for.

    For things the compiler could do (but currently fails to), see e.g. these questions of mine concerning associativity and bounds checks.