c#, apache-spark, mobius

Facing difficulties installing Mobius in a Windows environment


I am primarily a .NET programmer and have been tasked with analyzing data with Spark and Cassandra. Since I don't know Java, I looked for a C# API for Spark and found Mobius. I downloaded the Mobius project from GitHub and followed the build steps for Windows, but I have not been able to get it to work. I have the following questions.

1) I have DataStax Enterprise on an Ubuntu machine, where Cassandra and Spark (standalone) are running. I would like to connect to that Spark instance from my .NET project and process the data in Cassandra. Is that possible, and can I do it in debug mode? I will be using Spark SQL only, as I am comfortable with SQL.

2) Is it mandatory to install Solr and Spark on my Windows machine for Mobius to work? Will I be able to connect to Cassandra (on the Ubuntu machine) from Spark and Mobius on Windows?

3) When I run the command “sparkclr-submit.cmd debug” to get the value for CSharpBackendPortNumber, I get an error saying that “load-spark-env.cmd” is missing. Where can I find this file, and how do I get the value for CSharpBackendPortNumber? Is it necessary to have Spark on my Windows machine?


Solution

    1. Using a Windows client to connect to a YARN-based Spark cluster on Linux is a supported and validated scenario for Mobius. I have never tried using a Windows client for Mobius with a standalone Linux-based Spark cluster. I recommend using a Linux machine as the Mobius client first, to verify basic Mobius functionality.

    2. Mobius does not need Solr. You should be able to use Mobius to connect to Cassandra deployed on any OS.

    3. load-spark-env.cmd is part of the Spark release, so a Spark release must be available on the Windows machine. You need to set the SPARK_HOME environment variable to its location before running sparkclr-submit.cmd.
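
For point 3, a minimal sketch of what that could look like in a Windows command prompt. The path C:\spark is a hypothetical install location, not something taken from the question, and the check assumes load-spark-env.cmd sits in the bin folder of the Spark release, as it does in the standard Apache Spark distribution:

```bat
rem Hypothetical location; replace with the folder where the Spark release was unpacked.
set SPARK_HOME=C:\spark

rem Stop early if the file named in the error message is not under SPARK_HOME\bin.
if not exist "%SPARK_HOME%\bin\load-spark-env.cmd" (
    echo load-spark-env.cmd not found under %SPARK_HOME%\bin
    exit /b 1
)

rem Command from the question; run it from the folder that contains sparkclr-submit.cmd.
call sparkclr-submit.cmd debug
```

If the check fails, SPARK_HOME is probably pointing at the wrong folder, or the Spark release has not been unpacked on the Windows machine yet.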