I am submitting a script with spark-submit and passing it a file using the --files option. Later on I need to read that file on a worker.
I don't understand what API I should use to do that. I figured I'd try just:
with open('myfile') as f:
    data = f.read()
but this did not work.
I am able to pass the file using the addFile mechanism (SparkContext.addFile), but that may not be good enough for my case.
This may seem like a very simple question, but I did not find any comprehensive documentation on spark-submit. The docs certainly don't cover it.
Well, this is embarrassing. I forgot to look inside spark-submit --help. And this is what it says:
--files FILES Comma-separated list of files to be placed in the working
directory of each executor. File paths of these files
in executors can be accessed via SparkFiles.get(fileName).
Sometimes it's right under one's own nose.
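
For anyone landing here, this is a minimal sketch of how the help text might translate into PySpark code. It assumes the job was started with something like spark-submit --files /some/path/myfile my_script.py. The helper names read_shipped_file and run_job and the resolve parameter are hypothetical and exist only for illustration; SparkFiles.get is the call named in the help output.

```python
def read_shipped_file(file_name, resolve=None):
    """Return the text of a file that was shipped with --files.

    `resolve` maps a bare file name to a local path. It is a hypothetical
    hook added for this sketch; by default SparkFiles.get is used.
    """
    if resolve is None:
        # Imported lazily so the helper can be tried without a Spark install.
        from pyspark import SparkFiles
        resolve = SparkFiles.get
    # SparkFiles.get takes the bare file name, not the path given to --files.
    with open(resolve(file_name)) as f:
        return f.read()


def run_job(sc, file_name="myfile"):
    """Hypothetical driver-side usage: read the shipped file inside tasks."""
    rdd = sc.parallelize(range(4))
    # The lambda runs on the executors, so the path is resolved there.
    return rdd.map(lambda _: read_shipped_file(file_name)).collect()
```

The point of going through SparkFiles.get instead of a bare open('myfile') is that the path is resolved on whichever machine the task runs, so the code does not depend on the process's current working directory.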