Tags: azure, apache-spark, azure-blob-storage, azure-storage, sas-token

SAS token for an Azure container generated by a C# client is not permitted to call getAccessControl/setAccessControl


I have a Spark application, and I want it to write its event log into an Azure Blob Storage container.

I want to authenticate using a SAS token. The SAS token generated by the Azure portal works fine, but the one generated by my C# client does not, and I don't know what the difference between the two tokens is.

This is how I generate the SAS token in the Azure portal:

(screenshot of the portal's "Shared access signature" blade)

This is my Spark configuration:

    spark.eventLog.dir: "abfss://sparkevent@lydevstorage0.dfs.core.windows.net/log"
    spark.hadoop.fs.azure.account.auth.type.lydevstorage0.dfs.core.windows.net: "SAS"
    spark.hadoop.fs.azure.sas.fixed.token.lydevstorage0.dfs.core.windows.net: ""
    spark.hadoop.fs.azure.sas.token.provider.type.lydevstorage0.dfs.core.windows.net: "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider"

This is the C# code:

    BlobSasBuilder blobSasBuilder = new BlobSasBuilder()
    {
        StartsOn = DateTimeOffset.UtcNow.AddDays(-1),
        ExpiresOn = DateTimeOffset.UtcNow.AddDays(1),
        Protocol = SasProtocol.HttpsAndHttp,
        BlobContainerName = "sparkevent",
        Resource = "b" // I also tried "c"
    };
    blobSasBuilder.SetPermissions(BlobContainerSasPermissions.All);

    string sasToken2 = blobSasBuilder.ToSasQueryParameters(
        new StorageSharedKeyCredential("lydevstorage0", <access key>)).ToString();
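For comparison, the portal's "Shared access signature" blade issues an *account* SAS (its query string carries ss= and srt= parameters), whereas BlobSasBuilder produces a *service* SAS scoped to a single container or blob. A sketch of generating an account SAS from C# with AccountSasBuilder — the account name and key are placeholders, as above — might look like this:

```csharp
using System;
using Azure.Storage;
using Azure.Storage.Sas;

// Sketch: an account SAS, which is what the portal's
// "Shared access signature" blade issues, as opposed to the
// service SAS that BlobSasBuilder produces.
AccountSasBuilder accountSas = new AccountSasBuilder()
{
    Services = AccountSasServices.Blobs,
    ResourceTypes = AccountSasResourceTypes.All, // service + container + object
    ExpiresOn = DateTimeOffset.UtcNow.AddDays(1),
    Protocol = SasProtocol.Https,
};
accountSas.SetPermissions(AccountSasPermissions.All);

// "<access key>" is a placeholder for the storage account key.
string accountSasToken = accountSas
    .ToSasQueryParameters(new StorageSharedKeyCredential("lydevstorage0", "<access key>"))
    .ToString();
Console.WriteLine(accountSasToken);
```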

The error is:

Exception in thread "main" java.nio.file.AccessDeniedException: Operation failed: "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.", 403, HEAD, https://lydevstorage0.dfs.core.windows.net/sparkevent/?upn=false&action=getAccessControl&timeout=90&sv=2021-02-12&spr=https,http&st=2023-06-26T03:33:27Z&se=2023-06-28T03:33:27Z&sr=c&sp=racwdxlti&sig=XXXXX
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1384)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:611)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:599)
        at org.apache.spark.deploy.history.EventLogFileWriter.requireLogBaseDirAsDirectory(EventLogFileWriters.scala:77)
        at org.apache.spark.deploy.history.SingleEventLogFileWriter.start(EventLogFileWriters.scala:221)
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:83)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:612)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
        at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: Operation failed: "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.", 403, HEAD, https://lydevstorage0.dfs.core.windows.net/sparkevent/?upn=false&action=getAccessControl&timeout=90&sv=2021-02-12&spr=https,http&st=2023-06-26T03:33:27Z&se=2023-06-28T03:33:27Z&sr=c&sp=racwdxlti&sig=XXXXX
        at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:231)
        at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:191)
        at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:464)
        at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:189)
        at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:911)
        at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:892)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getIsNamespaceEnabled(AzureBlobFileSystemStore.java:358)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:932)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:609)
        ... 23 more

Again, the SAS token generated in the Azure portal works fine; only the one from the C# client fails.


Solution

  • If you are using an Azure Data Lake Storage Gen2 account with a hierarchical namespace, generate the SAS with the Azure.Storage.Files.DataLake package instead of the Blob package. The request failing in your error is getAccessControl, an ACL operation on the dfs endpoint that your Blob service SAS does not authorize; the code below creates a Data Lake SAS token in C#.

    Code:

    using System;
    using Azure.Storage;
    using Azure.Storage.Files.DataLake;
    using Azure.Storage.Sas;

    namespace SAStoken
    {
        class Program
        {
            private static void Main()
            {
                var AccountName = "venkat098";
                var AccountKey = "";
                var FileSystemName = "filesystem1";
                StorageSharedKeyCredential key = new StorageSharedKeyCredential(AccountName, AccountKey);
                string dfsUri = "https://" + AccountName + ".dfs.core.windows.net";
                var dataLakeServiceClient = new DataLakeServiceClient(new Uri(dfsUri), key);
                var directoryclient = dataLakeServiceClient.GetFileSystemClient(FileSystemName);
                DataLakeSasBuilder sas = new DataLakeSasBuilder()
                {
                    FileSystemName = FileSystemName, // container (file system) name
                    Resource = "d",
                    IsDirectory = true,
                    ExpiresOn = DateTimeOffset.UtcNow.AddDays(7),
                    Protocol = SasProtocol.HttpsAndHttp,
                };
                sas.SetPermissions(DataLakeAccountSasPermissions.All);
                Uri sasUri = directoryclient.GenerateSasUri(sas);
                Console.WriteLine(sasUri);
            }
        }
    }

    Output:

    https://venkat098.dfs.core.windows.net/filesystem1?sv=2022-11-02&spr=https,http&se=2023-07-04T05%3A53%3A39Z&sr=c&sp=racwdl&sig=xxxxxx
    
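If the token must also cover the getAccessControl/setAccessControl calls shown in the question's error, the permissions can be spelled out with DataLakeSasPermissions instead: its All value includes the ManageOwnership ("o") and ManageAccessControl ("p") bits that ACL operations require on hierarchical-namespace accounts. A sketch, reusing the directoryclient and FileSystemName variables from the program above:

```csharp
// Sketch: a file-system-scoped SAS whose permission set explicitly
// covers ACL operations. DataLakeSasPermissions.All includes
// ManageOwnership ("o") and ManageAccessControl ("p"), which calls
// such as getAccessControl need.
DataLakeSasBuilder aclSas = new DataLakeSasBuilder()
{
    FileSystemName = FileSystemName,
    Resource = "c", // file-system (container) scope
    ExpiresOn = DateTimeOffset.UtcNow.AddDays(7),
};
aclSas.SetPermissions(DataLakeSasPermissions.All);
Uri aclSasUri = directoryclient.GenerateSasUri(aclSas);
Console.WriteLine(aclSasUri);
```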


    I verified the token by appending it to the URL of an image file in the container, and the file was served successfully:

    https://venkat098.dfs.core.windows.net/filesystem1/cell_division.jpeg?sv=2022-11-02&spr=https,http&se=2023-07-04T05%3A53%3A39Z&sr=c&sp=racwdl&sig=xxxxx
    
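    Back in the question's Spark configuration, the query-string part of the generated URI (everything after the "?") is what goes into the fixed SAS token setting, for example (signature elided):

        spark.hadoop.fs.azure.sas.fixed.token.lydevstorage0.dfs.core.windows.net: "sv=2022-11-02&spr=https,http&se=2023-07-04T05%3A53%3A39Z&sr=c&sp=racwdl&sig=xxxxxx"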


    Reference:

    Use .NET to manage data in Azure Data Lake Storage Gen2 - Azure Storage | Microsoft Learn