val model: org.apache.spark.ml.feature.Word2VecModel = new Word2Vec().setNumPartitions(20).setInputCol("value").setOutputCol("feature").fit(copus)
model.save(s"$HDFS_URL/w2vmodel")
When I save this model, it creates only a single part file under the data folder, part-r-0000-988jdu-sduj76-jh433.snappy.parquet,
with a size of 900 MB.
val model: org.apache.spark.ml.feature.Word2VecModel = Word2VecModel.load(s"$HDFS_URL/w2vmodel")
When I load this model, I get an OutOfMemory
exception.
Is there any way this model can be saved as multiple Parquet part files, or in some other form?
I am a newbie, so any suggestion will be appreciated.
Coincidentally, this problem was recently discussed on the developers list, and the discussion resulted in a JIRA ticket and a pull request:
If you want a quick solution, you can try the MLlib implementation with Spark 2.0 or later (SPARK-11994).
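A minimal sketch of what switching to the MLlib (RDD-based) implementation might look like. It assumes `copus` is the DataFrame from the question with an array-of-strings column named "value", and that a `SparkContext` is available; the helper names `trainAndSave` and `loadModel` are hypothetical and only there for illustration. Whether this actually avoids the OutOfMemory error will depend on your Spark version and available memory.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.feature.{Word2Vec, Word2VecModel}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

// Hypothetical helper: train with the RDD-based API and save the model.
// Assumption: `copus` has an array<string> column called "value".
def trainAndSave(sc: SparkContext, copus: DataFrame, path: String): Unit = {
  // The MLlib Word2Vec expects an RDD of tokenized sentences, not a DataFrame.
  val sentences: RDD[Seq[String]] =
    copus.select("value").rdd.map(_.getSeq[String](0))

  val model: Word2VecModel = new Word2Vec()
    .setNumPartitions(20)
    .fit(sentences)

  // The MLlib save/load methods take the SparkContext explicitly.
  model.save(sc, path)
}

// Hypothetical helper: load the model saved above.
def loadModel(sc: SparkContext, path: String): Word2VecModel =
  Word2VecModel.load(sc, path)
```

With these helpers, `trainAndSave(sc, copus, s"$HDFS_URL/w2vmodel")` followed by `loadModel(sc, s"$HDFS_URL/w2vmodel")` would mirror the save and load calls from the question.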