I have a team where many member has permission to submit Spark tasks to YARN (the resource management) by command line. It's hard to track who is using how much cores, who is using how much memory...e.g. Now I'm looking for a software, framework or something could help me monitor the parameters that each member used. It will be a bridge between client and YARN. Then I could used it to filter the submit commands.
I did take a look at mlflow and I really like the MLFlow Tracking but it was designed for ML training process. I wonder if there is an alternative for my purpose? Or there is any other solution for the problem.
Thank you!
My recommendation would be to build such a tool yourself as its not too complicated, have a wrapper script to spark submit which logs the usage in a DB and after the spark job finishes the wrapper will know to release information. could be done really easily. In addition you can even block new spark submits if your team already asked for too much information.
And as you build it your self its really flexible as you can even create "sub teams" or anything you want.