apache-spark, databricks, azure-databricks

Authorization Header issue (to Blob Storage or ADLS1 or ADLS2) in Databricks / AZURE


Code supplied by Databricks on the Coursera DP-203 course, Databricks Managed Service on AZURE:

%fs head /mnt/training/wikipedia/pageviews/pageviews_by_second.tsv

It does not work. It gives:

AzureException: hadoop_azure_shaded.com.microsoft.azure.storage.StorageException: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
Caused by: StorageException: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.

Why would they (Databricks) leave it like that and not update the access path appropriately?

  • I cannot do much about security on AZURE in terms of Access Keys.
  • SAS possibly, but nothing is stated as needing to be done. Why not state it if something does need to be done? An omission?
  • I can find the files by running display(dbutils.fs.ls('/databricks-datasets/wikipedia-datasets/data-001/pageviews/raw/')) and %fs head /databricks-datasets/wikipedia-datasets/data-001/pageviews/raw/pageviews_by_second.tsv

What I also did was the following, where I changed the path to get spark.read to work:
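A minimal sketch of that kind of change, with placeholder container and storage account names (not the course's real ones), assuming the cluster already has SAS or access-key credentials configured for the account:

    # Sketch: read the TSV over the wasbs:// scheme instead of the
    # broken /mnt/training mount. cont_name and storage_acc_name are
    # placeholders; credentials are assumed to be configured separately.
    csvFile = "wasbs://cont_name@storage_acc_name.blob.core.windows.net/wikipedia/pageviews/pageviews_by_second.tsv"

    df = (spark.read
          .option("header", "true")
          .option("sep", "\t")
          .csv(csvFile))

    display(df)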

But wasbs is deprecated.

So, what is the idea here? I can also do csvFile = "/databricks-datasets/wikipedia-datasets/data-001/pageviews/raw/pageviews_by_second.tsv", and that works fine.
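That working read, as a short sketch (the header and tab separator options reflect how this dataset is laid out):

    # The built-in /databricks-datasets copy needs no mount or credentials.
    csvFile = "/databricks-datasets/wikipedia-datasets/data-001/pageviews/raw/pageviews_by_second.tsv"
    df = spark.read.option("header", "true").option("sep", "\t").csv(csvFile)
    display(df)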

In Databricks Community Edition I can run the query against /mnt/training/... fine.

I am wondering what to conclude. Possibly a lack of updates to Databricks after migrating to ADLS2, moving or changing mount points? At this stage of the course I am not expecting that ABFS is needed for creating mount points.


Solution

  • AzureException: hadoop_azure_shaded.com.microsoft.azure.storage.StorageException: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. Caused by: StorageException: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.


The above error can occur because an invalid SAS token was used when creating the mount point, or because the SAS token that was used has expired.

When you connect to ADLS1 or ADLS2, they use different endpoints: ADLS1 uses adl, while ADLS2 uses wasbs or abfss.
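For reference, a sketch of the corresponding URI formats (account and container names are placeholders):

    adl_path   = "adl://acc_name.azuredatalakestore.net/path/file.tsv"             # ADLS Gen1
    wasbs_path = "wasbs://cont_name@acc_name.blob.core.windows.net/path/file.tsv"  # Blob endpoint (deprecated)
    abfss_path = "abfss://cont_name@acc_name.dfs.core.windows.net/path/file.tsv"   # ADLS Gen2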

You need to update the mount point for the particular location.

dbutils.fs.mount(
    source="wasbs://cont_name@storage_acc_name.blob.core.windows.net",
    mount_point="/mnt/blob1",  # must be an absolute path under /mnt
    extra_configs={"fs.azure.sas.cont_name.storage_acc_name.blob.core.windows.net": "SAS token"}
)
    
%fs head /mnt/blob1/filepath
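
    Note that dbutils.fs.mount fails if the mount point already exists, so to refresh a mount whose SAS token has expired, unmount it first. A minimal sketch, using the mount point name from above:

    # If /mnt/blob1 is already mounted (e.g. with an expired SAS token),
    # unmount it before mounting again with the new token.
    if any(m.mountPoint == "/mnt/blob1" for m in dbutils.fs.mounts()):
        dbutils.fs.unmount("/mnt/blob1")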
    
