I am running a PySpark AWS Glue job that includes a Python UDF. In the logs I see this line repeated:
INFO [Executor task launch worker for task 15765] python.PythonUDFRunner (Logging.scala:logInfo(54)):
Times: total = 268103, boot = 21, init = 2187, finish = 265895
Does anyone know what this logInfo output (total/boot/init/finish) means?
I have looked at the Spark source code and am none the wiser, and I can't find this information mentioned anywhere else I have looked.
OK, so this is what it all means:
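As far as I can tell from Spark's Python runner code and the matching worker.py, all four values are in milliseconds and break down roughly as follows (treat this as my reading of the source, not official documentation):

boot: time from the JVM starting the task until the Python worker process is up and running.
init: time the worker spends setting itself up (reading broadcast variables, deserializing the UDF, and so on).
finish: time spent actually pushing the data through the Python function and writing the results back.
total: the sum of the three.

Here is a small sketch that parses the line from the question and checks that arithmetic. The parse_times helper is a hypothetical name of mine, not part of Spark:

```python
import re

# Log line copied from the question.
LOG_LINE = "Times: total = 268103, boot = 21, init = 2187, finish = 265895"

TIMES_RE = re.compile(
    r"Times: total = (\d+), boot = (\d+), init = (\d+), finish = (\d+)"
)


def parse_times(line):
    """Return the four timings as integers (assumed to be milliseconds)."""
    match = TIMES_RE.search(line)
    if match is None:
        raise ValueError("not a PythonUDFRunner 'Times' line: %r" % line)
    total, boot, init, finish = (int(group) for group in match.groups())
    return {"total": total, "boot": boot, "init": init, "finish": finish}


times = parse_times(LOG_LINE)

# The three phases should add up to the reported total.
assert times["boot"] + times["init"] + times["finish"] == times["total"]

# Print each phase in seconds, assuming the raw values are milliseconds.
for name, millis in times.items():
    print(f"{name:>6}: {millis / 1000:.3f} s")
```

For the line in the question, finish accounts for about 266 of the 268 seconds, so nearly all of the task time is spent inside the Python function itself rather than in starting or setting up the worker.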
Now hopefully it makes more sense.
And remember: if possible, do not use plain Python UDFs; try to write a pandas UDF instead.
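To illustrate the difference: a plain Python UDF is called once per row, while a pandas UDF receives whole batches as pandas Series (transferred via Arrow), which usually cuts the per-row overhead. Below is a minimal sketch, assuming Spark 3.x (Glue 3.0 or later) with pandas and pyarrow installed. The plus_one function and the id_plus_one column are made-up placeholders, since the question does not show the actual UDF. On older Spark 2.x versions you would pass PandasUDFType.SCALAR explicitly instead of relying on type hints.

```python
import pandas as pd


def plus_one(values: pd.Series) -> pd.Series:
    """Hypothetical batch function: receives a whole pandas Series per call."""
    return values + 1


def main():
    # Spark is imported here so that plus_one can be unit-tested
    # without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf

    # In a Glue job you would normally reuse the session from GlueContext.
    spark = SparkSession.builder.getOrCreate()

    # Wrap the batch function; the Series -> Series type hints tell
    # Spark 3.x that this is a scalar pandas UDF.
    plus_one_udf = pandas_udf(plus_one, returnType="long")

    df = spark.range(5)  # single column named "id"
    df.select(plus_one_udf("id").alias("id_plus_one")).show()


if __name__ == "__main__":
    main()
```

If you rerun the job after such a change, comparing the finish value in the Times log line before and after is a quick way to see whether the switch helped.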