java multithreading asynchronous garbage-collection completable-future

CompletableFuture Chain uncompleted -> Garbage Collector?


Suppose I have one (or more) CompletableFuture that has not been started yet, with a few thenApplyAsync() and anyOf() calls chained onto it.

Will the Garbage Collector remove all of that?

If there is a join()/get() at the end of that chain -> same question: Will the Garbage Collector remove all of that?

Maybe we need more information about the context of the join().

That join() is the last statement in a thread, and there are no side effects. Is the thread still active in that case? (See: Java Thread Garbage collected or not)

Anyway, is it a good idea to push a poison pill down the chain if I'm sure (maybe in a try-catch-finally) that I will not start that CompletableFuture chain, or is that not necessary?

I'm asking because of something like this: https://bugs.openjdk.java.net/browse/JDK-8160402

A related question: when is the thread executor signaled to schedule a new task? I think it is when the CompletableFuture moves on to the next chained CompletableFuture. So do I only need to worry about memory leaks and not thread leaks?

Edit: What do I mean by a not-started CompletableFuture?

I mean a var notStartedCompletableFuture = new CompletableFuture<Object>(); instead of a CompletableFuture.supplyAsync(....);

I can start notStartedCompletableFuture like this: notStartedCompletableFuture.complete(new Object()); later in the program flow or from another thread.

Edit 2: A more detailed example:

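// Needs: import java.util.concurrent.CompletableFuture;
//        import java.util.concurrent.atomic.AtomicReference;
AtomicReference<CompletableFuture<Object>> outsideReference = new AtomicReference<>();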

final var myOuterThread = new Thread(() ->
{
    final var A = new CompletableFuture<Object>();
    final var B = new CompletableFuture<Object>();

    final var C = A.thenApplyAsync((element) -> new Object());
    final var D = CompletableFuture.anyOf(A, C);

    A.complete(new Object());

    // throw new RuntimeException();

    //outsideReference.set(B);

    // B.complete(new Object());  // Edit: this shouldn't be here; I will remove it in my next iteration

    D.join();

});

myOuterThread.start();

// The myOuterThread variable is not referenced anywhere else; it is effectively a local variable, named only so that I can refer to it in the text.

  1. In the normal case in my example there is no outside reference. The CompletableFutures in the thread never have a chance of being completed. Normally the GC could safely erase both the thread and its contents, the CompletableFutures. But I don't think that would happen?
  2. If I abort this by throwing an exception, the join() is never reached; then I think everything would be erased by the GC?
  3. If I hand one of the CompletableFutures to the outside via that AtomicReference, there could be a chance to unblock the join(). There should be no GC here until the unblock happens. BUT the myOuterThread waiting on that join() doesn't have to do anything more after the join(). So erasing that thread before someone from outside completes B could be an optimization. But I think this would not happen either?!

One more question: how can I prove that behavior, i.e. whether threads are blocked waiting on a join() or have been returned to a thread pool, where the thread also "blocks"?


Solution

  • You seem to be struggling with different ways that CompletableFuture might leak, depending on how you created it. But it doesn't matter how, where, when or why it was created. The only thing that matters is whether or not it is still reachable.

    Will the Garbage Collector remove all of that?

    There are two places where we would expect there to be references to a CompletableFuture:

    • In the Runnable (or whatever) that would complete the future.
    • In any other code that would (at some point) attempt to get the eventual value from the future.

    If you have a call to thenApplyAsync() or anyOf(), then the reference to the Runnable is in the arguments to that call. If the call can still happen, then the Runnable must still be reachable.

    In your example:

    var notStartedCompletableFuture = new CompletableFuture<Object>();
    

    if the variable notStartedCompletableFuture is still accessible by some code that is still executing, then that CompletableFuture is reachable and won't be garbage collected.

    On the other hand, if notStartedCompletableFuture is no longer accessible, and if the future is no longer reachable by some other path, then it won't be reachable at all ... and will be a candidate for garbage collection.
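One way to watch this reachability rule in action is with weak references. The following is only a sketch: the class and method names are made up for illustration, and System.gc() is merely a hint to the JVM, so the "abandoned chain" result is likely but not guaranteed on any given run.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.concurrent.CompletableFuture;

class ReachabilityDemo {

    // Sleeps briefly without forcing callers to handle InterruptedException.
    static void pause() {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Builds a never-completed chain and returns only a weak reference to its tail,
    // so no strong reference to any stage survives this method.
    static WeakReference<CompletableFuture<Object>> abandonedChain() {
        CompletableFuture<Object> a = new CompletableFuture<>();
        CompletableFuture<Object> c = a.thenApplyAsync(x -> new Object());
        CompletableFuture<Object> d = CompletableFuture.anyOf(a, c);
        return new WeakReference<>(d);
    }

    // Requests a GC a few times. Because System.gc() is only a hint, this may return false.
    static boolean clearedAfterGc(WeakReference<?> ref) {
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();
            pause();
        }
        return ref.get() == null;
    }

    // While the head of a chain is strongly held, its dependent stage stays reachable through it.
    static boolean dependentSurvivesWhileHeadIsHeld() {
        CompletableFuture<Object> head = new CompletableFuture<>();
        WeakReference<CompletableFuture<Object>> dependent =
                new WeakReference<>(head.thenApplyAsync(x -> new Object()));
        System.gc();
        pause();
        boolean alive = dependent.get() != null;
        Reference.reachabilityFence(head); // keeps 'head' strongly reachable up to this point
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("abandoned chain collected: " + clearedAfterGc(abandonedChain()));
        System.out.println("dependent kept alive by head: " + dependentSurvivesWhileHeadIsHeld());
    }
}
```

Nothing in the abandoned chain was ever completed or "cleaned up"; it becomes collectable purely because nothing reachable points at it any more.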


    If there is a join() / get() at the end of that chain -> same question: Will the Garbage Collector remove all of that?

    That makes no difference. It is all based on reachability. (The only wrinkle is that a thread that is currently alive [1] is always reachable, irrespective of any other references to its Thread object. The same applies to its Runnable, and other objects reachable from the Runnable.)

    But it is worth noting that if you call join() or get() on a thread / future that never terminates / completes, you will block the current thread, potentially forever. And that is as bad as a thread leak.

    [1] A thread is "alive" from when it is started to when it terminates.
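As for how you could observe a thread that is stuck in join(): a rough sketch is to inspect Thread.getState() and isAlive() from another thread. The class and helper names below are hypothetical, and the polling timeouts are arbitrary values chosen for the demo.

```java
import java.util.concurrent.CompletableFuture;

class BlockedJoinDemo {

    // Polls until the thread reaches the wanted state or the timeout expires.
    static boolean awaitState(Thread t, Thread.State wanted, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (t.getState() != wanted && System.nanoTime() < deadline) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return t.getState() == wanted;
    }

    // Returns { blocked while the future was incomplete, terminated after it was completed }.
    static boolean[] run() {
        CompletableFuture<Object> pendingFuture = new CompletableFuture<>();
        Thread waiter = new Thread(pendingFuture::join, "waiter");
        waiter.setDaemon(true); // so a stuck waiter cannot keep the JVM from exiting
        waiter.start();

        // The waiter is alive (and therefore reachable) for as long as it is blocked in join().
        boolean blocked = awaitState(waiter, Thread.State.WAITING, 5_000) && waiter.isAlive();

        // Completing the future from "outside" is what releases the waiter.
        pendingFuture.complete(new Object());
        boolean terminated = awaitState(waiter, Thread.State.TERMINATED, 5_000);

        return new boolean[] {blocked, terminated};
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("blocked in join(): " + result[0]);
        System.out.println("terminated after complete(): " + result[1]);
    }
}
```

A thread dump (for example with jstack) shows the same information for every thread in the JVM, including idle pool workers, which are also parked but are waiting on the pool's queue rather than on your future.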


    When is the Thread-Executor signaled to schedule a new task?

    It depends what you mean by "schedule". If you mean, when is the task submitted, the answer is when submit is called. If you mean, when is it actually run ... well it goes into the queue, and it runs when it gets to the head of the queue and a worker thread is free to execute it.

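    In the case of thenApplyAsync(), the call only registers the dependent stage, and that happens when the method call is evaluated. So for example if thenApplyAsync is being called on the result of a previous call, then that call must return first. The function you passed is handed to the executor once the stage it depends on has completed (straight away, if that stage had already completed when thenApplyAsync was called). anyOf() does not submit a task at all; it returns a new future that completes when any of its inputs completes.

    The order in which the stages are registered is a consequence of the basic properties of Java expression evaluation ... applied to the expression that you are using to construct the chain of stages.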
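If you want to see for yourself when the executor actually receives the function passed to thenApplyAsync(), a small experiment along these lines may help. It is a sketch under assumptions: the counting executor is a hypothetical stand-in that runs each task on the calling thread, which a real pool would not do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

class SubmissionTimingDemo {

    // Returns { tasks handed to the executor before complete(), tasks handed over after complete() }.
    static int[] countSubmissions() {
        AtomicInteger submitted = new AtomicInteger();

        // Hypothetical executor for illustration: counts each task, then runs it inline.
        Executor counting = task -> {
            submitted.incrementAndGet();
            task.run();
        };

        CompletableFuture<String> source = new CompletableFuture<>();
        CompletableFuture<Integer> dependent = source.thenApplyAsync(String::length, counting);

        int before = submitted.get();   // source is still incomplete at this point
        source.complete("hello");       // completing the source triggers the dependent stage
        int after = submitted.get();

        dependent.join();               // already complete, because the executor ran the task inline
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = countSubmissions();
        System.out.println("submitted before complete(): " + counts[0]);
        System.out.println("submitted after complete():  " + counts[1]);
    }
}
```

So an uncompleted chain holds memory (the registered stages) but does not occupy any worker thread while it waits.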


    In general you don't need try / finally or try with resources to clean up potential memory leaks.

    All you need to do is to make sure that you don't keep references to the various futures, stages, etc in variables, data structures, etc that will remain accessible / reachable beyond the lifetime of your computation. If you do that ... those references are liable to be the source of the leaks.
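As an illustration of that advice, here is a minimal sketch of a long-lived registry that drops its own reference as soon as a future completes. The PendingRequests class and its methods are invented for this example; they are not part of any library.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class PendingRequests {

    // A long-lived map like this is a typical place for futures to "leak".
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Registers a new pending future under the given id.
    CompletableFuture<String> register(String id) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(id, future);
        // Remove the map entry as soon as the future completes, normally or exceptionally.
        future.whenComplete((value, error) -> pending.remove(id, future));
        return future;
    }

    // Completes the future registered under the id, if there is one.
    boolean complete(String id, String value) {
        CompletableFuture<String> future = pending.get(id);
        return future != null && future.complete(value);
    }

    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingRequests requests = new PendingRequests();
        CompletableFuture<String> reply = requests.register("req-1");
        System.out.println("pending before: " + requests.size());
        requests.complete("req-1", "done");
        System.out.println("pending after:  " + requests.size() + ", value = " + reply.join());
    }
}
```

An entry whose future never completes would still stay in the map, so in a design like this you might also bound its lifetime, for example with orTimeout() (available since Java 9).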

    Thread leaks should not be your concern. If your code is not creating threads explicitly, they are being managed by the executor service / pool.