amazon-web-services  api  flask  pyspark  amazon-emr

Can't reach flask in Spark master node using Amazon EMR


I want to understand whether it's possible to run a Flask application connected to the Spark master node of an Amazon EMR cluster. The goal is to call Flask from a web app to retrieve Spark outputs. The ports are open in the EMR cluster's security group, but I can't reach the Flask application from outside on its port.

What do you think about it? Are there any other solutions?


Solution

  • It is entirely possible to call Flask (or any other service) running on EMR, but depending on what you are doing you might find Apache Livy handy. Livy is fully supported by EMR, and it gives you a REST API for interacting with Spark: you can use it to submit jobs and retrieve their results synchronously or asynchronously.
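
As a rough illustration of the Livy route, here is a minimal sketch that opens a PySpark session, runs one statement, and polls for its output through Livy's REST endpoints (`/sessions` and `/sessions/{id}/statements`). The `LIVY_URL` value, the helper names (`livy_request`, `wait_for`, `run_statement`), and the timeouts are assumptions made for the example, not anything from the question; 8998 is Livy's default port, but your cluster may be configured differently and must allow access from wherever this code runs.

```python
import json
import time
import urllib.request

# Assumption: replace with your EMR master node's address; 8998 is Livy's default port.
LIVY_URL = "http://localhost:8998"


def livy_request(method, path, payload=None, base_url=LIVY_URL):
    """Send one JSON request to Livy and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for(path, target_state, poll_seconds=2.0, timeout=300.0, request=livy_request):
    """Poll a Livy resource until its "state" field equals target_state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = request("GET", path)
        if body["state"] == target_state:
            return body
        time.sleep(poll_seconds)
    raise TimeoutError(f"{path} did not reach state {target_state!r} in {timeout}s")


def run_statement(code, request=livy_request):
    """Create a PySpark session, run one statement, and return its output dict."""
    session = request("POST", "/sessions", {"kind": "pyspark"})
    session_path = f"/sessions/{session['id']}"
    try:
        # A new session has to become "idle" before it accepts statements.
        wait_for(session_path, "idle", request=request)
        stmt = request("POST", f"{session_path}/statements", {"code": code})
        # A finished statement reports the state "available".
        done = wait_for(f"{session_path}/statements/{stmt['id']}", "available", request=request)
        return done["output"]
    finally:
        # Always close the session so it does not keep cluster resources.
        request("DELETE", session_path)
```

Calling `run_statement("spark.range(100).count()")` from the web app's backend would then return Livy's output object for that statement. For repeated calls you would probably keep one session alive instead of creating a new one per request, and handle failed sessions or statements explicitly rather than waiting for the timeout.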