Tags: c, apache-spark, distributed-computing

How to run a C algorithm on a Spark cluster?


I have an algorithm that is written in C.

I want to use Spark to run it on a cluster of nodes. How can I run my C executable on Spark? I have read about JNI and about launching a Java Process to execute the binary.


Solution

  • Here is a nice article from Databricks on how to run C and C++ code in Apache Spark.

    https://kb.databricks.com/_static/notebooks/scala/run-c-plus-plus-scala.html

    After moving the C/C++ code to the underlying distributed file system, you can compile it and run it on the Spark worker nodes (see the sketch below).
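
    One common way to run a pre-compiled C binary from Spark, without writing JNI bindings, is the RDD.pipe() transformation, which forks the external process on each partition and streams records through it via stdin/stdout. The following is a minimal Scala sketch; it assumes the executable my_algorithm has already been compiled and placed at the same path on every worker node (for example, copied there or mounted from a shared/distributed file system) and that it reads one record per line from stdin:

    ```scala
    import org.apache.spark.sql.SparkSession

    object PipeCBinary {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("pipe-c-binary").getOrCreate()
        val sc = spark.sparkContext

        // Hypothetical input data; replace with your real data source.
        // Each element becomes one line on the C program's stdin.
        val input = sc.parallelize(Seq("1 2 3", "4 5 6", "7 8 9"))

        // pipe() forks the external process once per partition, writes the
        // partition's records to its stdin, and returns its stdout lines
        // as a new RDD. The path below is an assumption; point it at
        // wherever the compiled binary lives on the workers.
        val output = input.pipe("/opt/myapp/my_algorithm")

        output.collect().foreach(println)
        spark.stop()
      }
    }
    ```

    Alternatively, you can ship the binary with the job (for example via spark-submit --files) instead of pre-installing it on each node. JNI is also an option if you want to call the C code as a shared library rather than a separate process, but pipe() keeps the C program completely unchanged.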