Is it possible to save a Spark ML pipeline to a database (Cassandra for example)? From the documentation I can only see the save to path option:
myMLWritable.save(toPath);
Is there a way to wrap or replace the MLWriter instance returned by myMLWritable.write() and redirect the output to the database?
It is not possible (or at least not supported) at the moment. The ML writer is not extensible, and it depends on Parquet files and a directory structure to represent models.
Technically speaking, you can extract the individual components and use the internal private API to recreate the models from scratch, but that is likely the only option.
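Because the writer only ever produces a directory of files, one workaround that leaves the ML writer untouched is to save to a temporary path as usual, pack that directory into a single byte array, and store the bytes in a blob column. The following is a minimal sketch of the packing step only, not something Spark provides: ModelDirArchiver, zipDirectory and unzipTo are hypothetical names, and it assumes the saved directory is on a filesystem the driver can read (for example local mode).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper (not part of Spark): turns a saved model directory
// into a byte array and back, so the bytes can be kept in a blob column.
final class ModelDirArchiver {

    // Zip every regular file under dir into an in-memory archive.
    // Empty directories are not preserved.
    static byte[] zipDirectory(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(dir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Path file : files) {
                // Store paths relative to dir, with '/' as the separator.
                String name = dir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    // Unpack an archive produced by zipDirectory into targetDir.
    static void unzipTo(byte[] archive, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Refuse entries that would be written outside targetDir.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                    continue;
                }
                Files.createDirectories(out.getParent());
                Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

To use it, you would call myMLWritable.save(tmpPath), pass that directory to zipDirectory, and write the resulting bytes with whatever database driver you use. To restore, read the bytes back, call unzipTo with a fresh temporary directory, and load the model from that path in the usual way. Large models may not fit comfortably in a single blob value, so treat this as a starting point rather than a complete solution.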