I can find the answer to this in Java, but so far I haven't seen a Python solution so I'm posting this question.
In my log4j.properties, I have:
log4j.rootLogger=WARN,LOGFILE
log4j.appender.LOGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.LOGFILE.File=log/${scriptname}.log
log4j.appender.LOGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.LOGFILE.Append=false
log4j.appender.LOGFILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
In script.py, in my main function, I call a method to launch Spark:
spark_submit(yarn_pool, os.path.basename(__file__))
Which is defined here:
def spark_submit(yarn_pool, scriptname):
    spark_submit_command = 'spark2-submit'
    ret_code = subprocess.call([
        spark_submit_command,
        '--master', 'yarn',
        '--queue', yarn_pool,
        '--executor-memory', '16g',
        '--driver-java-options', f'-Dlog4j.debug=true -Dlogfile.name={scriptname}',
    ])
    return ret_code
Later in script.py, I attempt the logging:
from pyspark import SparkConf, SparkContext

conf = SparkConf()
conf.setAppName("My App")
spark = SparkContext(conf=conf)
log4jLogger = spark._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger("root.logger")
LOGGER.warn("Starting App")
I'm trying to find a way to pass the filename of my script into the spark_submit method, and then into log4j.properties, but I cannot figure out the syntax to get the code to actually recognize my scriptname parameter.
I've tried ${sys:scriptname} and ${env:scriptname} as well, and those are also unrecognized. There doesn't seem to be clear documentation on how variables are passed between all these files, and I'd appreciate help in understanding this.
The ${...} variables in the log4j.properties file are expanded using Java system properties. So if, in your log4j.properties file, you have
log4j.appender.LOGFILE.File=log/${scriptname}.log
you should be able to provide a value for scriptname using

f'-Dscriptname={scriptname}'
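
For example, the spark_submit method from the question could set that property in the same --driver-java-options string it already builds. This is a minimal sketch, assuming the flags shown in the question; the trailing scriptname positional argument (the application to submit) is an assumption, since the original snippet is truncated before that point:

import subprocess

def spark_submit(yarn_pool, scriptname):
    # -Dscriptname makes ${scriptname} resolvable in log4j.properties
    ret_code = subprocess.call([
        'spark2-submit',
        '--master', 'yarn',
        '--queue', yarn_pool,
        '--executor-memory', '16g',
        '--driver-java-options',
        f'-Dlog4j.debug=true -Dscriptname={scriptname}',
        scriptname,  # application script to submit (assumed)
    ])
    return ret_code

With scriptname set to script.py, log4j then expands log/${scriptname}.log to log/script.py.log.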