
Running nosetests for pyspark


How would one run unit tests with nose for Apache Spark applications written in Python?

With nose one would usually just call the command

nosetests

to run the tests in the tests directory of a Python package. PySpark scripts, however, need to be run with the spark-submit command instead of the usual Python executable so that the pyspark module can be imported. How would I combine nosetests with PySpark to run tests for my Spark application?


Solution

  • If it helps, we use nosetests to test Sparkling Pandas. We do a bit of magic in our utils file to add pyspark to the path based on the SPARK_HOME shell environment variable.
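
    That path setup looks roughly like the sketch below. This is a minimal illustration of the SPARK_HOME-based approach rather than the exact sparklingpandas helper; the function name add_pyspark_path is hypothetical, and the layout assumptions ($SPARK_HOME/python for pyspark, $SPARK_HOME/python/lib for the bundled py4j zip) match a standard Spark distribution but should be verified against your installation.

        import glob
        import os
        import sys

        def add_pyspark_path():
            # Hypothetical helper: prepend Spark's Python sources to
            # sys.path based on the SPARK_HOME environment variable.
            spark_home = os.environ.get("SPARK_HOME")
            if not spark_home:
                raise EnvironmentError("SPARK_HOME must point to your Spark installation")
            sys.path.insert(0, os.path.join(spark_home, "python"))
            # py4j (pyspark's JVM bridge) ships as a zip under python/lib;
            # pick up whichever version this Spark distribution bundles.
            for py4j in glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip")):
                sys.path.insert(0, py4j)

        add_pyspark_path()
        import pyspark  # importable now, even under plain `nosetests`

    With this in place, a plain nosetests run can import pyspark and construct a local SparkContext inside the tests, so no spark-submit wrapper is needed.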