apache-spark pyspark apache-spark-mllib

Is PySpark MLlib still maintained and developed?


In the 2017 book "Learning PySpark" one can read:

Even though MLlib is now in a maintenance mode, that is, it is not actively being developed (and will most likely be deprecated later) ...

So I wanted to know if there is any update about the situation. Is MLlib still maintained? Will it be deprecated soon?


Solution

  • As of Spark 2.0, the RDD-based APIs in the spark.mllib package have entered maintenance mode. The primary Machine Learning API for Spark is now the DataFrame-based API in the spark.ml package.

    Taken from https://spark.apache.org/docs/latest/ml-guide.html

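    So the RDD-based spark.mllib package is in "maintenance mode": per the same guide, it still receives bug fixes but no new features. MLlib as a whole is not deprecated, and the DataFrame-based spark.ml package is the API to use for new code.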
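
To make the "replacement" concrete, here is a minimal sketch of a lookup from a few legacy RDD-based classes in `pyspark.mllib` to their DataFrame-based counterparts in `pyspark.ml`. The class paths are real PySpark names, but the table is deliberately non-exhaustive, and the helper names (`MLLIB_TO_ML`, `suggest_ml_counterpart`, `is_legacy_mllib`) are hypothetical, made up for this illustration. The counterparts are not drop-in replacements: the `pyspark.ml` classes work on DataFrames rather than RDDs, so calling code usually has to change as well.

```python
# Illustrative, non-exhaustive lookup from legacy RDD-based classes
# (pyspark.mllib, maintenance mode) to DataFrame-based counterparts (pyspark.ml).
# The helper names and the selection of entries are assumptions for this sketch.
MLLIB_TO_ML = {
    "pyspark.mllib.classification.LogisticRegressionWithLBFGS":
        "pyspark.ml.classification.LogisticRegression",
    "pyspark.mllib.clustering.KMeans": "pyspark.ml.clustering.KMeans",
    "pyspark.mllib.recommendation.ALS": "pyspark.ml.recommendation.ALS",
    "pyspark.mllib.linalg.Vectors": "pyspark.ml.linalg.Vectors",
}


def is_legacy_mllib(qualified_name):
    """Return True when the path points into the RDD-based pyspark.mllib package."""
    return (qualified_name == "pyspark.mllib"
            or qualified_name.startswith("pyspark.mllib."))


def suggest_ml_counterpart(qualified_name):
    """Return the pyspark.ml counterpart for a legacy class path, or None if not listed."""
    return MLLIB_TO_ML.get(qualified_name)


if __name__ == "__main__":
    # Print each legacy path next to its suggested DataFrame-based counterpart.
    for legacy_path in MLLIB_TO_ML:
        print(legacy_path, "->", suggest_ml_counterpart(legacy_path))
```

A helper like this could be used to flag `pyspark.mllib` imports in an existing code base; anything it does not list still needs to be checked by hand against the PySpark API reference.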