Tags: scala, function, apache-spark, config, typesafe-config

How to invoke Spark functions (with arguments) from application.properties (config file)?


I have a Typesafe Config file named application.properties that contains values like:

dev.execution.mode = local
dev.input.base.dir = /Users/debaprc/Documents/QualityCheck/Data
dev.schema.lis = asin StringType,subs_activity_date DateType,marketplace_id DecimalType

I read these values as Strings in my Spark code like this:

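import com.typesafe.config.ConfigFactory
import org.apache.spark.sql.SparkSession

def main(args: Array[String]): Unit = {
    // Load application.properties from the classpath and select the "dev" block
    val props = ConfigFactory.load()
    val envProps = props.getConfig("dev")

    val spark = SparkSession.builder.appName("DataQualityCheckSession")
      .config("spark.master", envProps.getString("execution.mode"))
      .getOrCreate()
}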

I also have certain functions defined in my Spark code (func1, func2, etc.). I want to specify which functions are to be called, along with their respective arguments, in my application.properties file. Something like this:

dev.functions.lis = func1,func2,func2,func3
dev.func1.arg1.lis = arg1,arg2
dev.func2.arg1.lis = arg3,arg4,arg5
dev.func2.arg2.lis = arg6,arg7,arg8
dev.func3.arg1.lis = arg9,arg10,arg11,arg12

Once I specify these, how do I call the functions with the provided arguments in Spark? Or do I need to specify the functions and arguments in a different way?


Solution

  • I agree with @cchantep that the approach seems wrong. But if you still want to do something like this, I would decouple the function names in the properties file from the actual functions/methods in your code.

    I tried this and it worked fine:

    def function1(args: String): Unit = {
      println(s"func1 args: $args")
    }
    
    def function2(args: String): Unit = {
      println(s"func2 args: $args")
    }
    
    val functionMapper: Map[String, String => Unit] = Map(
      "func1" -> function1,
      "func2" -> function2
    )
    
    val args = "arg1,arg2"
    
    functionMapper("func1")(args)
    functionMapper("func2")(args)
    

    Output:

    func1 args: arg1,arg2
    func2 args: arg1,arg2
    

    Edit: simplified the approach and added an output example.
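The mapper above can also be driven by the keys from the question. The following is only a minimal sketch under a few assumptions: the stand-in functions take a `Seq[String]` and return a `String` (so the result is easy to inspect), and the n-th occurrence of a name in `functions.lis` reads its arguments from `<name>.arg<n>.lis`, which is one possible reading of the question's layout. To stay dependency-free it parses the sample with `java.util.Properties` rather than Typesafe Config; the names `ConfigDispatch`, `runAll`, and `demoLookup` are hypothetical.

```scala
import java.io.StringReader
import java.util.Properties
import scala.collection.mutable

object ConfigDispatch {
  // Hypothetical stand-ins for the real functions.
  def function1(args: Seq[String]): String = s"func1 args: ${args.mkString(",")}"
  def function2(args: Seq[String]): String = s"func2 args: ${args.mkString(",")}"
  def function3(args: Seq[String]): String = s"func3 args: ${args.mkString(",")}"

  // Names used in the properties file, decoupled from the Scala method names.
  val functionMapper: Map[String, Seq[String] => String] = Map(
    "func1" -> (function1 _),
    "func2" -> (function2 _),
    "func3" -> (function3 _)
  )

  // Calls every function listed under "functions.lis", in order.
  // Assumption: the n-th occurrence of a name reads "<name>.arg<n>.lis".
  def runAll(lookup: String => String): Seq[String] = {
    val seen = mutable.Map.empty[String, Int].withDefaultValue(0)
    lookup("functions.lis").split(",").map(_.trim).toSeq.map { name =>
      seen(name) += 1
      val args = lookup(s"$name.arg${seen(name)}.lis").split(",").map(_.trim).toSeq
      val f = functionMapper.getOrElse(name, sys.error(s"Unknown function: $name"))
      f(args)
    }
  }

  // Sample properties copied from the question.
  val sample: String =
    """dev.functions.lis = func1,func2,func2,func3
      |dev.func1.arg1.lis = arg1,arg2
      |dev.func2.arg1.lis = arg3,arg4,arg5
      |dev.func2.arg2.lis = arg6,arg7,arg8
      |dev.func3.arg1.lis = arg9,arg10,arg11,arg12
      |""".stripMargin

  // Looks keys up under the "dev." prefix and fails loudly on a missing key.
  def demoLookup: String => String = {
    val props = new Properties()
    props.load(new StringReader(sample))
    key => Option(props.getProperty(s"dev.$key"))
      .getOrElse(sys.error(s"Missing key: dev.$key"))
  }

  def main(cmdArgs: Array[String]): Unit =
    runAll(demoLookup).foreach(println)
}
```

With the `envProps` object from the question, the same dispatch could presumably be invoked as `ConfigDispatch.runAll(key => envProps.getString(key))`, since values in a `.properties` file are plain strings. Failing on an unknown function name or a missing key keeps a typo in the config from being silently skipped.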