I am trying to use addFile so that all nodes can see the certificate file. Step 1 (downloading the file from S3) and Step 2 (distributing it with addFile and resolving the path with SparkFiles.get) work successfully: on the driver I can open the file with with open(cert_filepath, "r") as file and print its contents.
But in Step 3 (reading from DB2 over JDBC) I get the following error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3) (10.239.124.48 executor 1): com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2043][11550][4.27.25] Exception java.io.FileNotFoundException: Error opening socket to server / on port 53,101 with message: /local_disk0/spark-c30d1a7f-deea-4184-a9f9-2b6c9eab6c5e/userFiles-761620cf-2cb1-4623-9677-68694f0e4b3c/dwt01_db2_ssl.arm (No such file or directory). ERRORCODE=-4499, SQLSTATE=08001
The port is actually 53101; the error message just formats the number with a comma.
The essential part of the code:
import boto3
from pyspark import SparkContext, SparkFiles

sc = SparkContext.getOrCreate()

# Step 1: download the certificate from S3 to the driver's working directory
s3_client = boto3.client("s3")
s3_client.download_file(
    Bucket="my-bucket-s3",
    Key="my/path/db2/dwt01_db2_ssl.arm",
    Filename="dwt01_db2_ssl.arm",
)

# Step 2: distribute the file to all nodes and resolve its local path
sc.addFile("dwt01_db2_ssl.arm")
cert_filepath = SparkFiles.get("dwt01_db2_ssl.arm")

user_name = cls.get_aws_secret(secret_name=cls._db2_username_aws_secret_name, key="username", region="eu-central-1")
password = cls.get_aws_secret(secret_name=cls._db2_password_aws_secret_name, key="password", region="eu-central-1")

# Step 3: read from DB2 over an SSL JDBC connection using the certificate
driver = "com.ibm.db2.jcc.DB2Driver"
jdbc_url = f"jdbc:db2://{hostname}:{port}/{database}:sslConnection=true;sslCertLocation={cert_filepath};"
df = (
    spark.read.format("jdbc")
    .option("driver", driver)
    .option("url", jdbc_url)
    .option("dbtable", f"({query}) as src")
    .option("user", user_name)
    .option("password", password)
    .load()
)
I can't figure out what could be causing this FileNotFoundException, since the file clearly exists when I read and print it on the driver.
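For completeness, this is roughly the check I run on the driver (just a minimal sketch of the read-and-print step described above):

# Sanity check on the driver: the path returned by SparkFiles.get exists and is readable
with open(cert_filepath, "r") as file:
    print(file.read())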
You're getting the java.io.FileNotFoundException for the reason mentioned in this comment:
In cluster mode, a local file that has not been added at spark-submit time will not be found via addFile. This is because the driver (application master) is started on the cluster and is already running when it reaches the addFile call. It is too late at this point: the application has already been submitted, and the local file system is the file system of a specific cluster node.
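If you want to confirm that this is what is happening, a quick check (a sketch only, not part of your original code) is to ask the executors whether the path that was resolved on the driver exists on their local file systems:

import os
from pyspark import SparkFiles

# Path resolved on the driver; this string is what ends up embedded in the JDBC URL
driver_path = SparkFiles.get("dwt01_db2_ssl.arm")

# Each task runs on an executor and reports whether it can see that exact path locally
print(
    sc.parallelize(range(4), 4)
      .map(lambda _: (os.path.exists(driver_path), driver_path))
      .collect()
)

On a multi-node cluster you would typically see False from the executor tasks, which matches the error above.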
So, to overcome this issue I'd suggest using a DBFS path (e.g. /dbfs/dwt01_db2_ssl.arm), which is consistent and accessible across all workers.
Here's your updated code:
import shutil

# sc = SparkContext.getOrCreate()
s3_client = boto3.client("s3")
s3_client.download_file(
    Bucket="my-bucket-s3",
    Key="my/path/db2/dwt01_db2_ssl.arm",
    Filename="dwt01_db2_ssl.arm",
)
# sc.addFile("dwt01_db2_ssl.arm")
# cert_filepath = SparkFiles.get("dwt01_db2_ssl.arm")

# Copy the downloaded certificate to DBFS so every node can reach it at the same path
shutil.copy("dwt01_db2_ssl.arm", "/dbfs/dwt01_db2_ssl.arm")
cert_filepath = "/dbfs/dwt01_db2_ssl.arm"
Afterward, you'll have to configure the Spark SSL properties as follows, because your code is currently missing this:
spark.conf.set("spark.ssl", "true")
spark.conf.set("spark.ssl.trustStore", cert_filepath)
And then the rest of the code will be as follows:
user_name = cls.get_aws_secret(secret_name=cls._db2_username_aws_secret_name, key="username", region="eu-central-1")
password = cls.get_aws_secret(secret_name=cls._db2_password_aws_secret_name, key="password", region="eu-central-1")
driver = "com.ibm.db2.jcc.DB2Driver"
jdbc_url = f"jdbc:db2://{hostname}:{port}/{database}:sslConnection=true;"
df = (
    spark.read.format("jdbc")
    .option("driver", driver)
    .option("url", jdbc_url)
    .option("dbtable", f"({query}) as src")
    .option("user", user_name)
    .option("password", password)
    .load()
)
Note: if you noticed, I removed sslCertLocation from jdbc_url because I don't think it's needed anymore, now that the Spark SSL properties handle that.
Making the above changes should ideally resolve the error. I suspect another issue might still come up with the .arm file, since it's not a standard trust store format, but let me know if you run into any problems.
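If the .arm file does turn out to be the next problem, one possible workaround (a sketch only, assuming the .arm file is a PEM-encoded certificate and that the JDK's keytool is available on the driver; the truststore path and password below are made up for illustration) is to import it into a JKS truststore and point the SSL configuration at that instead:

import subprocess

truststore_path = "/dbfs/db2_truststore.jks"   # hypothetical target path
truststore_password = "changeit"               # example password, use your own

# Import the PEM certificate into a JKS truststore with keytool
subprocess.run(
    [
        "keytool", "-importcert", "-noprompt",
        "-alias", "db2",
        "-file", "/dbfs/dwt01_db2_ssl.arm",
        "-keystore", truststore_path,
        "-storepass", truststore_password,
    ],
    check=True,
)

spark.conf.set("spark.ssl.trustStore", truststore_path)
spark.conf.set("spark.ssl.trustStorePassword", truststore_password)

Alternatively, the DB2 JCC driver itself accepts sslTrustStoreLocation and sslTrustStorePassword properties in the JDBC URL, if you prefer to keep the trust material on the JDBC side rather than in the Spark SSL configuration.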