I am trying to install PySpark locally following the instructions at
https://spark.apache.org/docs/latest/api/python/getting_started/install.html#using-pypi
I can see that extra dependencies are available, such as sql and pandas_on_spark, which can be installed with
pip install pyspark[sql,pandas_on_spark]
But how can we find all available extras?
Looking at the JSON metadata of the pyspark package (based on https://wiki.python.org/moin/PyPIJSON) at
https://pypi.org/pypi/pyspark/json
I could not find the possible extra dependencies (as described in What is 'extra' in pypi dependency?); the value of requires_dist is null.
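For reference, this is roughly the check I did (a small sketch, assuming the requests package is available):

```python
# Query the PyPI JSON API for pyspark and inspect the declared
# dependencies; for pyspark, requires_dist comes back as null.
import requests

info = requests.get("https://pypi.org/pypi/pyspark/json", timeout=10).json()["info"]
print(info["requires_dist"])  # prints None, so the extras are not listed here
```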
Many thanks for your help.
As far as I know, you cannot easily get the list of extras. If the list is not clearly documented, you will have to look at the code/config used for packaging. In this case, see here, which gives the following list: ml, mllib, sql, and pandas_on_spark.
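If you already have the package installed locally, you can also read the declared extras from the installed metadata instead of PyPI. A minimal sketch, assuming Python 3.8+ and that the installed distribution declares its extras:

```python
# Read the "Provides-Extra" fields from the locally installed
# pyspark metadata; each entry is one installable extra.
from importlib.metadata import metadata

extras = metadata("pyspark").get_all("Provides-Extra")
print(extras)  # e.g. ['ml', 'mllib', 'sql', 'pandas_on_spark'], if declared
```

This only works for packages that are already installed and that ship the extras in their metadata; otherwise you are back to reading the packaging configuration.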