java, multithreading, spring-boot, cron

Multithreading cron jobs in Spring Boot


I'm developing a Spring Boot application that looks for given keywords on given websites and scrapes the webpages if a match is found. I wrote a cron job to refresh the results every 5 minutes:

@Scheduled(cron = "0 */5 * * * *") // seconds field is 0, so this fires once every 5 minutes
public void fetchLatestResults() throws Exception {
    LOG.debug("Fetching latest results >>>");
    List<Keyword> keywords = keywordService.findOldestSearched10();
    keywordService.updateLastSearchDate(keywords);
    searchResultService.fetchLatestResults(keywords);
    LOG.debug("<<< Latest results fetched");
}

The database has 100 keywords, and each run of the cron job first selects the 10 keywords whose results were fetched longest ago. So, for example, the first run should use the keywords with ids 1 to 10, the second run ids 11 to 20, and so on; the 11th run should use ids 1 to 10 again, and the cycle continues.

The problem is that executing the search takes much longer than 5 minutes. So, even though I've set the cron job to run every 5 minutes, the second run doesn't start until the first has completed. As a result, completing the search takes hours. How can I make this process multithreaded so that multiple instances of the cron job run simultaneously, given that they operate on different lists of keywords?


Solution

  • I suggest making the execution of your cron job asynchronous.

    Create an executor class that hands your cron job to a thread pool, so it runs on a separate thread:

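    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.RejectedExecutionException;
    import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3+
    import javax.annotation.PreDestroy;    // jakarta.annotation.PreDestroy on Spring Boot 3+
    import org.springframework.stereotype.Component;

    @Component
    public class YourCronJobExecutor {

        // Upper bound on the number of cron runs that can execute at the same time.
        private final int threadsNumber = 10;
        private ExecutorService executorService;

        @PostConstruct
        private void init() {
            executorService = Executors.newFixedThreadPool(threadsNumber);
        }

        // Stop the pool on application shutdown so its threads do not keep the JVM alive.
        @PreDestroy
        private void shutdown() {
            executorService.shutdown();
        }

        /**
         * Submits the job to the pool and returns immediately.
         * @param runnable the job to run on a pool thread.
         */
        public void start(Runnable runnable) {
            try {
                executorService.execute(runnable);
            } catch (RejectedExecutionException e) {
                // A fixed pool with an unbounded queue rejects tasks only after
                // it has been shut down, so recreate it and retry once.
                init();
                executorService.execute(runnable);
            }
        }
    }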
    

    Create a processor class that will contain the logic of your cron job:

    @Component
    public class CronJobProcessor {
    
        //logger
        //autowired beans
    
        public void executeYouCronJob() {
            LOG.debug("Fetching latest results >>>");
            List<Keyword> keywords = keywordService.findOldestSearched10();
            keywordService.updateLastSearchDate(keywords);
            searchResultService.fetchLatestResults(keywords);
            LOG.debug("<<< Latest results fetched");
        }
    }
    

    And finally, your cron job class will look like this:

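    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    @Component
    public class YourCronJobClass {

        private final YourCronJobExecutor yourCronJobExecutor;

        private final CronJobProcessor cronJobProcessor;

        // The constructor name must match the class name.
        @Autowired
        public YourCronJobClass(YourCronJobExecutor yourCronJobExecutor,
                                CronJobProcessor cronJobProcessor) {
            this.yourCronJobExecutor = yourCronJobExecutor;
            this.cronJobProcessor = cronJobProcessor;
        }

        // Spring cron fields: second minute hour day-of-month month day-of-week.
        // A 0 in the seconds field fires once every 5 minutes; a * there would
        // fire every second during each fifth minute.
        @Scheduled(cron = "0 */5 * * * *")
        public void fetchLatestResults() {
            // Only hands the job to the pool; the work itself runs on a pool thread.
            yourCronJobExecutor.start(cronJobProcessor::executeYouCronJob);
        }
    }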
    

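    This way the scheduled method itself returns within a couple of milliseconds, and a separate pool thread performs the actual job for as long as it needs. With threadsNumber set to 10, up to 10 runs can execute at the same time; further runs wait in the pool's queue until a thread is free.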
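
    To see why handing the job to a pool lets runs overlap, here is a minimal plain-JDK sketch with no Spring involved. The names AsyncDispatchDemo and allJobsOverlap, and the parameters, are hypothetical and only for illustration; each job just waits on a latch, which stands in for the slow search.

    ```java
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical demo class: not part of the answer's Spring code.
    class AsyncDispatchDemo {

        // Submits `jobs` tasks to a pool of `poolThreads` threads. Each task
        // waits up to `waitMillis` for all tasks to have started. Returns true
        // only if every task saw all the others start while it was still running.
        static boolean allJobsOverlap(int poolThreads, int jobs, long waitMillis) {
            ExecutorService pool = Executors.newFixedThreadPool(poolThreads);
            CountDownLatch started = new CountDownLatch(jobs);
            CountDownLatch finished = new CountDownLatch(jobs);
            AtomicInteger sawAllOthers = new AtomicInteger();
            try {
                for (int i = 0; i < jobs; i++) {
                    // Stands in for the scheduled method: execute() returns at once.
                    pool.execute(() -> {
                        try {
                            started.countDown();
                            // Stands in for the slow search.
                            if (started.await(waitMillis, TimeUnit.MILLISECONDS)) {
                                sawAllOthers.incrementAndGet();
                            }
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        } finally {
                            finished.countDown();
                        }
                    });
                }
                finished.await(); // wait until every task has completed
                return sawAllOthers.get() == jobs;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            } finally {
                pool.shutdown();
            }
        }
    }
    ```

    With three pool threads, allJobsOverlap(3, 3, 2000) returns true. With a single thread, allJobsOverlap(1, 3, 100) returns false, because each job has to finish before the next one begins, which is the behaviour described in the question.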

    You might also want to run the search for each keyword in its own thread, but that is a somewhat different story.
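
    As a rough idea of what per-keyword threading could look like, here is a plain-JDK sketch built on ExecutorService.invokeAll. The class PerKeywordSearch, the method searchAll, and the search function are hypothetical: the services in the question only expose a list-based fetchLatestResults, so a per-keyword call is assumed to exist.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.function.Function;

    // Hypothetical helper: not part of the answer's Spring code.
    class PerKeywordSearch {

        // Runs `search` once per keyword on a pool of `threads` threads and
        // returns the results in the same order as the input list.
        static <K, R> List<R> searchAll(List<K> keywords, Function<K, R> search, int threads) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            try {
                List<Callable<R>> tasks = new ArrayList<>();
                for (K keyword : keywords) {
                    tasks.add(() -> search.apply(keyword));
                }
                List<R> results = new ArrayList<>();
                // invokeAll blocks until every task has completed and returns
                // the futures in the order of the task list.
                for (Future<R> future : pool.invokeAll(tasks)) {
                    results.add(future.get());
                }
                return results;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while searching", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("A keyword search failed", e.getCause());
            } finally {
                pool.shutdown();
            }
        }
    }
    ```

    This sketch creates and shuts down a pool on every call to stay self-contained; in a real application a shared, long-lived pool would probably be a better fit. Note also that one failing keyword aborts the whole batch here, which may or may not be what you want.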