I am trying to load images from a folder in PySpark:
from pyspark.ml.image import ImageSchema
from pyspark.sql.functions import lit
zero_df = ImageSchema.readImages('../Transfer-Learning-PySpark/images/o').withColumn("label", lit(0))
This throws the following error:
AttributeError Traceback (most recent call last)
<ipython-input-9-29c9b120f9c2> in <module>
2 from pyspark.sql.functions import lit
3
----> 4 zero_df = ImageSchema.readImages('../Transfer-Learning-PySpark/images/o').withColumn("label", lit(0))
AttributeError: '_ImageSchema' object has no attribute 'readImages'
I am using Python 3.8 and Spark v3.0.2.
Since Spark 2.4, images can be loaded directly with a DataFrameReader using the image format:
zero_df = spark.read.format("image").load(<path to files>)
More details can be found in the Spark documentation for the image data source.
ImageSchema.readImages has been deprecated since then, and the method was removed in Spark 3.0.0. That is why the call fails with an AttributeError on Spark 3.0.2.
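Applied to the code in the question, the replacement could look roughly like the sketch below. The helper name load_labeled_images is hypothetical (it is not part of PySpark), and it assumes you already have a SparkSession to pass in. The label is added with a SQL expression via selectExpr so the helper needs no extra imports; .withColumn("label", lit(0)) as in the question should work just as well.

```python
def load_labeled_images(spark, path, label):
    """Load the images under `path` and attach a constant integer label.

    `spark` is assumed to be an existing SparkSession (Spark 2.4 or later).
    This helper is only an illustration, not a PySpark API.
    """
    # The "image" data source replaces the removed ImageSchema.readImages.
    images = spark.read.format("image").load(path)
    # Keep every existing column and add a constant `label` column.
    return images.selectExpr("*", f"{int(label)} AS label")


# Usage, assuming an active SparkSession named `spark`:
# zero_df = load_labeled_images(spark, "../Transfer-Learning-PySpark/images/o", 0)
# zero_df.printSchema()
```

The resulting DataFrame has an image struct column (with fields such as origin, height, width, nChannels, mode and data) plus the label column, so it can be used wherever the old zero_df was expected.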