I have a Spark application, and I want it to write its event log to an Azure Blob container.
I want to authenticate using a SAS token. The SAS token generated in the Azure portal works fine, but the one generated by the C# client does not. I don't know what the difference between these two SAS tokens is.
The working SAS token is the one I generate in the Azure portal. This is my Spark conf:
spark.eventLog.dir: "abfss://sparkevent@lydevstorage0.dfs.core.windows.net/log"
spark.hadoop.fs.azure.account.auth.type.lydevstorage0.dfs.core.windows.net: "SAS"
spark.hadoop.fs.azure.sas.fixed.token.lydevstorage0.dfs.core.windows.net: ""
spark.hadoop.fs.azure.sas.token.provider.type.lydevstorage0.dfs.core.windows.net: "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider"
And this is the C# code that generates the failing token:
using Azure.Storage;
using Azure.Storage.Sas;

BlobSasBuilder blobSasBuilder = new BlobSasBuilder()
{
    StartsOn = DateTimeOffset.UtcNow.AddDays(-1),
    ExpiresOn = DateTimeOffset.UtcNow.AddDays(1),
    Protocol = SasProtocol.HttpsAndHttp,
    BlobContainerName = "sparkevent",
    Resource = "b" // I also tried "c"
};
blobSasBuilder.SetPermissions(BlobContainerSasPermissions.All);
string sasToken2 = blobSasBuilder.ToSasQueryParameters(new StorageSharedKeyCredential("lydevstorage0", <access key>)).ToString();
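The resulting sasToken2 string is what I put into the empty spark.hadoop.fs.azure.sas.fixed.token value in the conf above, roughly like this (parameter values taken from the failing request below; as far as I can tell, SasQueryParameters.ToString() returns the query string without a leading '?'):
spark.hadoop.fs.azure.sas.fixed.token.lydevstorage0.dfs.core.windows.net: "sv=2021-02-12&spr=https,http&st=2023-06-26T03:33:27Z&se=2023-06-28T03:33:27Z&sr=c&sp=racwdxlti&sig=XXXXX"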
The error is:
Exception in thread "main" java.nio.file.AccessDeniedException: Operation failed: "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.", 403, HEAD, https://lydevstorage0.dfs.core.windows.net/sparkevent/?upn=false&action=getAccessControl&timeout=90&sv=2021-02-12&spr=https,http&st=2023-06-26T03:33:27Z&se=2023-06-28T03:33:27Z&sr=c&sp=racwdxlti&sig=XXXXX
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1384)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:611)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:599)
at org.apache.spark.deploy.history.EventLogFileWriter.requireLogBaseDirAsDirectory(EventLogFileWriters.scala:77)
at org.apache.spark.deploy.history.SingleEventLogFileWriter.start(EventLogFileWriters.scala:221)
at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:83)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:612)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: Operation failed: "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.", 403, HEAD, https://lydevstorage0.dfs.core.windows.net/sparkevent/?upn=false&action=getAccessControl&timeout=90&sv=2021-02-12&spr=https,http&st=2023-06-26T03:33:27Z&se=2023-06-28T03:33:27Z&sr=c&sp=racwdxlti&sig=XXXXX
at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:231)
at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:191)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:464)
at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:189)
at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:911)
at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:892)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getIsNamespaceEnabled(AzureBlobFileSystemStore.java:358)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:932)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:609)
... 23 more
I tried the SAS token generated in the Azure portal, and it worked fine.
If you are using a Data Lake Storage Gen2 account with a hierarchical namespace, you can use the DataLake package (Azure.Storage.Files.DataLake) with the code below to create a SAS token in C#.
Code:
using System;
using Azure.Storage;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Sas;

namespace SAStoken
{
    class Program
    {
        private static void Main()
        {
            var AccountName = "venkat098";
            var AccountKey = "";
            var FileSystemName = "filesystem1";

            // Sign requests against the DFS endpoint with the account key
            StorageSharedKeyCredential key = new StorageSharedKeyCredential(AccountName, AccountKey);
            string dfsUri = "https://" + AccountName + ".dfs.core.windows.net";
            var dataLakeServiceClient = new DataLakeServiceClient(new Uri(dfsUri), key);
            var directoryclient = dataLakeServiceClient.GetFileSystemClient(FileSystemName);

            DataLakeSasBuilder sas = new DataLakeSasBuilder()
            {
                FileSystemName = FileSystemName, // container name
                Resource = "d",
                IsDirectory = true,
                ExpiresOn = DateTimeOffset.UtcNow.AddDays(7),
                Protocol = SasProtocol.HttpsAndHttp,
            };
            sas.SetPermissions(DataLakeAccountSasPermissions.All);

            // Generate the SAS URI for the file system and print it
            Uri sasUri = directoryclient.GenerateSasUri(sas);
            Console.WriteLine(sasUri);
        }
    }
}
Output:
https://venkat098.dfs.core.windows.net/filesystem1?sv=2022-11-02&spr=https,http&se=2023-07-04T05%3A53%3A39Z&sr=c&sp=racwdl&sig=xxxxxx
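If you only need the token itself, for example for the spark.hadoop.fs.azure.sas.fixed.token property in the question, you can take just the query-string part of the generated URI; a minimal sketch (Uri.Query includes the leading '?', so it is trimmed off here, assuming the Spark fixed-token property expects the bare token):
// Extract only the SAS query string from the generated URI
string sasToken = sasUri.Query.TrimStart('?');
Console.WriteLine(sasToken); // sv=2022-11-02&spr=https,http&se=...&sr=c&sp=racwdl&sig=xxxxxx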
I checked the URL against an image file in the container, and it worked successfully:
https://venkat098.dfs.core.windows.net/filesystem1/cell_division.jpeg?sv=2022-11-02&spr=https,http&se=2023-07-04T05%3A53%3A39Z&sr=c&sp=racwdl&sig=xxxxx
Reference:
Use .NET to manage data in Azure Data Lake Storage Gen2 - Azure Storage | Microsoft Learn