
Java Executor Best Practices for Tasks that Should Run Forever


I'm working on a Java project where I need to have multiple tasks running asynchronously. I'm led to believe Executor is the best way for me to do this, so I'm familiarizing myself with it. (Yay getting paid to learn!) However, it's not clear to me what the best way is to accomplish what I'm trying to do.

For the sake of argument, let's say I have two tasks running. Neither is expected to terminate, and both should run for the duration of the application's life. I'm trying to write a main wrapper class such that:

  • If either task throws an exception, the wrapper will catch it and restart the task.
  • If either task runs to completion, the wrapper will notice and restart the task.

Now, it should be noted that the implementation for both tasks will wrap the code in run() in an infinite loop that will never run to completion, with a try/catch block that should handle all runtime exceptions without disrupting the loop. I'm trying to add another layer of certainty; if either I or somebody who follows me does something stupid that defeats these safeguards and halts the task, the application needs to react appropriately.

Is there a best practice for approaching this problem that folks more experienced than me would recommend?

FWIW, I've whipped up this test class:


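import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecTest {

   private static ExecutorService executor = null;
   private static Future<?> results1 = null;
   private static Future<?> results2 = null;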

   public static void main(String[] args) {
      executor = Executors.newFixedThreadPool(2);
      while(true) {
         try {
            checkTasks();
            Thread.sleep(1000);
         }
         catch (Exception e) {
            System.err.println("Caught exception: " + e.getMessage());
         }
      }
   }

   private static void checkTasks() throws Exception{
      if (results1 == null || results1.isDone() || results1.isCancelled()) {
         results1 = executor.submit(new Test1());
      }

      if (results2 == null || results2.isDone() || results2.isCancelled()) {
         results2 = executor.submit(new Test2());
      }
   }
}

class Test1 implements Runnable {
   public void run() {
      while(true) {
         System.out.println("I'm test class 1");
         try {Thread.sleep(1000);} catch (Exception e) {}
      }

   }
}

class Test2 implements Runnable {
   public void run() {
      while(true) {
         System.out.println("I'm test class 2");
         try {Thread.sleep(1000);} catch (Exception e) {}
      }
   }
}

It's behaving the way I want, but I don't know if there are any gotchas, inefficiencies, or downright wrong-headedness waiting to surprise me. (In fact, given that I'm new to this, I'd be shocked if there wasn't something wrong/inadvisable about it.)

Any insight is welcomed.


Solution

  • I faced a similar situation in my previous project, and after my code blew up in the face of an angry customer, my buddies and I added two big safeguards:

    1. In the infinite loop, catch Errors too, not just Exceptions. Sometimes unexpected things happen and Java throws an Error at you, not an Exception.
    2. Use a back-off switch, so if something goes wrong and is non-recoverable, you don't escalate the situation by eagerly starting another loop. Instead, you need to wait until the situation goes back to normal and then start again.

    For example, we had a situation where the database went down and an SQLException was thrown inside the loop. The unfortunate result was that the code went through the loop again, only to hit the same exception again, and so forth. The logs showed that we hit the same SQLException about 300 times in a second! This happened intermittently several times, with occasional JVM pauses of 5 seconds or so during which the application was unresponsive, until eventually an Error was thrown and the thread died.

    So we implemented a back-off strategy, roughly as shown in the code below: if the exception is not recoverable (or is expected to take minutes to recover), we wait for a longer time before resuming operations.

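    import java.sql.SQLException;

    class Test1 implements Runnable {
      // placeholder value: how long to pause after a non-recoverable failure
      private static final long TIME_FOR_LONGER_BREAK = 60_000L;

      public void run() {
        boolean backoff = false;
        while(true) {
          if (backoff) {
            try {
              Thread.sleep(TIME_FOR_LONGER_BREAK);
            }
            catch (InterruptedException ie) {
              // restore the interrupt flag and stop so the thread can shut down
              Thread.currentThread().interrupt();
              return;
            }
            backoff = false;
          }
          System.out.println("I'm test class 1");
          try {
            // do important stuff here, use database and other critical resources
            doImportantStuff();
          }
          catch (SQLException se) {
            // delay the next loop
            backoff = true;
          }
          catch (Exception e) {
            // log it and keep looping
          }
          catch (Throwable t) {
            // Errors land here too; log it and keep looping
          }
        }
      }

      // placeholder for the real work
      private void doImportantStuff() throws SQLException {
      }
    }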
    

    If you implement your tasks this way, then I don't see the point of a third "watch-dog" thread with the checkTasks() method. Furthermore, for the same reasons I outlined above, I'd be cautious about simply restarting the task with the executor. First you need to understand why the task failed and whether the environment is stable enough that running the task again would be useful.
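
If you do keep a supervising layer, one way to combine the two points above (find out why the task stopped, and back off before restarting) might look like the sketch below. It is only an illustration under assumptions: the class and method names (BackoffSupervisor, delayMillis, awaitFailure, supervise) and the delay constants are made up for this example, and the doubling delay is a substitute for the fixed TIME_FOR_LONGER_BREAK used above, not something from the original project. The only standard-library behaviour it relies on is that Future.get() blocks until the task ends and wraps anything thrown by run() in an ExecutionException.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackoffSupervisor {

    // Hypothetical tuning values; pick ones that suit your environment.
    static final long BASE_DELAY_MS = 1_000L;
    static final long MAX_DELAY_MS = 60_000L;

    /** Pause before the next restart: doubles with each consecutive failure, capped at MAX_DELAY_MS. */
    static long delayMillis(int consecutiveFailures) {
        if (consecutiveFailures <= 0) {
            return 0L;
        }
        int shift = Math.min(consecutiveFailures - 1, 16); // bound the shift so the value cannot overflow
        return Math.min(BASE_DELAY_MS << shift, MAX_DELAY_MS);
    }

    /** Blocks until the task ends; returns null if run() returned normally, else what it threw. */
    static Throwable awaitFailure(Future<?> future) throws InterruptedException {
        try {
            future.get();
            return null;
        } catch (ExecutionException e) {
            return e.getCause(); // the exception or Error that escaped from run()
        }
    }

    /** Runs the task and restarts it after each exit, pausing longer each time; gives up after maxFailures exits. */
    static int supervise(ExecutorService executor, Runnable task, int maxFailures)
            throws InterruptedException {
        int failures = 0;
        while (true) {
            Throwable cause = awaitFailure(executor.submit(task));
            failures++;
            System.err.println("Task ended: " + (cause == null ? "returned normally" : cause));
            if (failures >= maxFailures) {
                return failures;
            }
            Thread.sleep(delayMillis(failures)); // back off before resubmitting
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Stand-in for a task that keeps failing because a resource is down.
        Runnable flaky = () -> { throw new IllegalStateException("database unavailable"); };
        int failures = supervise(executor, flaky, 2);
        executor.shutdown();
        System.out.println("Gave up after " + failures + " failures");
    }
}
```

A real supervisor would probably also reset the failure count once a task has run cleanly for a while, and decide from the reported cause whether restarting makes sense at all; both are left out here to keep the sketch short.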