I am trying to connect to S3-compatible storage provided by MinIO from Spark, but it reports that the bucket minikube does not exist, even though I have already created it.
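import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("AliceProcessingTwentyDotTwo")
.config("spark.serializer", "org.apache.spark.serializer.KryoSerializer").master("local[1]")
.getOrCreate() // builder() alone returns a Builder; getOrCreate() yields the SparkSession
val sc = spark.sparkContext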
sc.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
sc.hadoopConfiguration.set("fs.s3a.endpoint", "http://localhost:9000")
sc.hadoopConfiguration.set("fs.s3a.access.key", "minioadmin")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "minioadmin")
sc.hadoopConfiguration.set("fs.s3a.path.style.access", "true")
I am using the following guide to connect.
These are the dependencies I used in my Scala (sbt) build:
"org.apache.spark" %% "spark-core" % "2.4.0",
"org.apache.spark" %% "spark-sql" % "2.4.0",
"com.amazonaws" % "aws-java-sdk" % "1.11.712",
"org.apache.hadoop" % "hadoop-aws" % "2.7.3"
Try Spark 2.4.3 built without Hadoop, and pair it with Hadoop 2.8.2 or 3.1.2. After following the steps in the link below, I was able to connect to MinIO using the CLI.
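One way to express that version pairing in sbt is sketched below. Treat it as an untested sketch under a few assumptions: that sbt's default conflict resolution lets the explicit hadoop-client version win over the older Hadoop that spark-core pulls in, and that leaving out the explicit aws-java-sdk entry lets hadoop-aws bring in the SDK version it was built against. The Hadoop 2.8.2 choice comes from the suggestion above; the hadoop-client line is my addition. As far as I know, fs.s3a.path.style.access is only honoured from Hadoop 2.8 onward, which would explain why hadoop-aws 2.7.3 cannot find the bucket on a MinIO endpoint.

```scala
// build.sbt (sketch, not a verified build)
val sparkVersion  = "2.4.3"
val hadoopVersion = "2.8.2" // or "3.1.2", per the suggestion above

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"    % sparkVersion,
  "org.apache.spark"  %% "spark-sql"     % sparkVersion,
  // Assumption: keep hadoop-client and hadoop-aws on the same version
  // so the S3A connector matches the rest of the Hadoop classes.
  "org.apache.hadoop" %  "hadoop-client" % hadoopVersion,
  // hadoop-aws declares its own AWS SDK dependency, so no explicit
  // "com.amazonaws" entry is listed here.
  "org.apache.hadoop" %  "hadoop-aws"    % hadoopVersion
)
```

If other libraries on the classpath clash after the change, the mismatch is usually visible in the resolved dependency tree, so it is worth inspecting that before changing any of the fs.s3a settings.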