google-cloud-storage google-cloud-dataproc

Error Connecting to GCS using Private Keys


We have a Spark job running in Project1 that needs to access GCS in Project2. We pass the private key from Project2 to the SparkSession, but the job fails with "Invalid PKCS8 data."

Dataproc version - 1.4

session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.private.key.id","<private-key-id>");
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.private.key","<private-key>");
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.email","<client-email>");

ERROR:

2022-02-17T16:19:09.231359147Z DEFAULT Invalid PKCS8 data.
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.privateKeyFromPkcs8(CredentialFactory.java:346)
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.getCredentialsFromSAParameters(CredentialFactory.java:310)
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.getCredential(CredentialFactory.java:393)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getCredential(GoogleHadoopFileSystemBase.java:1324)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.createGcsFs(GoogleHadoopFileSystemBase.java:1459)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configure(GoogleHadoopFileSystemBase.java:1443)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:467)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3242)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3291)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3259)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:470)
    at com.gcp.util.Day2Util.deleteGCSPartFile(Day2Util.java:430)
    at com.gcp.ReadGCSWithSA.main(ReadGCSWithSA.java:42)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:939)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:948)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Is there any other way to pass the service account details? Please note that we are not able to pass a service account credential file.


Solution

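  • The properties above work as written. The problem was that I had earlier removed the -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY----- lines from the private_key value, which is why the key could not be parsed. With both markers left in place, the job runs fine.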
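
The stack trace fails in CredentialFactory.privateKeyFromPkcs8, which suggests the key string is read as a PEM block and not as bare Base64. A small pre-flight check along these lines might catch a stripped key before it reaches the Hadoop configuration. This is only a sketch: the class PemKeyCheck and its methods are hypothetical names invented for illustration, it uses only the JDK (not the GCS connector's own parser), and it assumes an RSA key such as the private_key value in a service account JSON key.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;

/** Hypothetical pre-flight check for a service account private key string. */
final class PemKeyCheck {
    static final String BEGIN = "-----BEGIN PRIVATE KEY-----";
    static final String END = "-----END PRIVATE KEY-----";

    /** True when both PEM markers are present, BEGIN before END. */
    static boolean hasPemMarkers(String key) {
        int begin = key.indexOf(BEGIN);
        return begin >= 0 && key.indexOf(END, begin + BEGIN.length()) > begin;
    }

    /** Parses a PEM-wrapped PKCS#8 RSA key; rejects input whose markers were stripped. */
    static PrivateKey parse(String pem) {
        if (!hasPemMarkers(pem)) {
            throw new IllegalArgumentException("Missing BEGIN/END PRIVATE KEY markers");
        }
        int start = pem.indexOf(BEGIN) + BEGIN.length();
        // Everything between the markers is Base64, possibly split across lines.
        String base64 = pem.substring(start, pem.indexOf(END, start)).replaceAll("\\s", "");
        try {
            byte[] der = Base64.getDecoder().decode(base64);
            return KeyFactory.getInstance("RSA").generatePrivate(new PKCS8EncodedKeySpec(der));
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("Not a valid PKCS#8 RSA key", e);
        }
    }

    /** Builds a throwaway PEM key so the check can be tried without a real credential. */
    static String samplePem() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            byte[] der = gen.generateKeyPair().getPrivate().getEncoded();
            String body = Base64.getMimeEncoder(64, new byte[] {'\n'}).encodeToString(der);
            return BEGIN + "\n" + body + "\n" + END + "\n";
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String pem = samplePem();
        System.out.println("with markers: " + hasPemMarkers(pem));
        // Simulates the mistake described above: markers removed from the key.
        String stripped = pem.replace(BEGIN, "").replace(END, "");
        System.out.println("markers stripped: " + hasPemMarkers(stripped));
    }
}
```

In the job itself, one could call PemKeyCheck.parse(privateKey) just before setting fs.gs.auth.service.account.private.key, so that a stripped key fails early with a clearer message than "Invalid PKCS8 data."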