apache-spark, spark-streaming

Run a Spark command automatically


I have an object in Spark Scala that reads an HDFS file and exports it to a local file within my cluster. Inside the object I created a SparkSession, and the function correctly returns what I want when I run the following command:

ReadFiles.main(Array("hdfs://.../info.log"))
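For context, a minimal sketch of what such an object might look like (the output path and write logic here are just placeholders, not my actual code):

    import org.apache.spark.sql.SparkSession

    object ReadFiles {
      def main(args: Array[String]): Unit = {
        // getOrCreate reuses an existing SparkSession if one is already running
        val spark = SparkSession.builder()
          .appName("ReadFiles")
          .getOrCreate()

        // Read the HDFS file passed as the first argument
        val df = spark.read.text(args(0))

        // Export it to a local path (placeholder location)
        df.coalesce(1)
          .write
          .mode("overwrite")
          .text("file:///tmp/info_export")
      }
    }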

But I want this function to run every 5 minutes. Is there a way to execute the command every 5 minutes? Or is there some setting in the SparkSession that does this?

Thanks


Solution

  • You can do this with a scheduled executor thread, as below.

    import java.util.concurrent.Executors
    import java.util.concurrent.TimeUnit.SECONDS

    // Wrap the existing job in a Runnable so the executor can schedule it
    def fileReaderThread() = new Runnable {
      override def run(): Unit = {
        ReadFiles.main(Array("hdfs://.../info.log"))
      }
    }

    // Start immediately (initial delay 0), then wait 300 seconds (5 minutes)
    // after each run finishes before starting the next one
    Executors.newSingleThreadScheduledExecutor
      .scheduleWithFixedDelay(fileReaderThread(), 0L, 300L, SECONDS)


    Call newSingleThreadScheduledExecutor only once, from a separate main. From then on it will keep invoking your read-files method at the fixed interval (scheduleWithFixedDelay waits 300 seconds after each run completes before starting the next).
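
    For example, a minimal driver could look like this (the wrapper object name is just a placeholder):

    import java.util.concurrent.Executors
    import java.util.concurrent.TimeUnit.SECONDS

    // Hypothetical wrapper: sets up the scheduler once in its own main
    object ReadFilesScheduler {
      def main(args: Array[String]): Unit = {
        val scheduler = Executors.newSingleThreadScheduledExecutor

        scheduler.scheduleWithFixedDelay(new Runnable {
          override def run(): Unit = ReadFiles.main(Array("hdfs://.../info.log"))
        }, 0L, 300L, SECONDS)

        // The executor's thread is non-daemon, so the JVM stays alive and the
        // task keeps running every 5 minutes until the process is stopped.
      }
    }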