Tags: html, pyspark, databricks, azure-databricks, azure-blob-storage

Set content type when uploading to Azure Blob Storage via Databricks


I am uploading a static site from the Databricks platform, using the command below to push HTML content to a location.

dbutils.fs.put("/mnt/$web/index.html", html, overwrite=True)

This works, but the browser downloads the HTML file instead of displaying it, because the blob is stored with the wrong content type: Content-Type: application/octet-stream.

Is there any way to set the content type from Databricks?


Solution

  • Finally, this code worked for me. First, I get the connection string from a Databricks secret scope:

    dbutils.secrets.get(scope = "generic-scope", key = "website-key") 
    

    If you don't have it stored in a secret scope, look for the connection string under the Storage Account's Access keys.

    [Image: access key location in the Azure storage account]

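    from azure.storage.blob import BlobServiceClient, ContentSettings

    # Placeholder: use the connection string from the secret scope or the portal
    connect_str = "connectionString"
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)

    # Instantiate a ContainerClient
    container_client = blob_service_client.get_container_client("$web")

    # List the blobs in the container and update the content type of each one
    blobs_list = container_client.list_blobs()
    for blob in blobs_list:
        print(blob.content_settings.content_type)  # application/octet-stream
        # list_blobs() yields BlobProperties objects, which have no
        # set_http_headers() method, so get a BlobClient for the blob first
        blob_client = container_client.get_blob_client(blob.name)
        blob_client.set_http_headers(
            content_settings=ContentSettings(
                content_type="text/html; charset=utf-8"
            )
        )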
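
The loop above gives every blob in the container the HTML content type, which is fine for a single index.html but would mislabel CSS or image files. A possible variation, sketched below, is to pick the content type from the file extension and set it in the same call that uploads the file, so no follow-up fix is needed. The helper names `guess_content_type` and `upload_with_content_type` are made up for this example, and it assumes the `container_client` created above; treat it as a starting point, not a tested drop-in.

```python
import mimetypes


def guess_content_type(blob_name, default="application/octet-stream"):
    """Return a Content-Type value for a blob name based on its extension."""
    content_type, _encoding = mimetypes.guess_type(blob_name)
    if content_type is None:
        # Unknown or missing extension: fall back to the generic binary type
        return default
    if content_type.startswith("text/"):
        # Assumption: text files in the site are stored as UTF-8
        content_type += "; charset=utf-8"
    return content_type


def upload_with_content_type(container_client, blob_name, data):
    """Upload data to the container and set Content-Type in the same call.

    Hypothetical helper: container_client is expected to be an
    azure.storage.blob.ContainerClient such as the one created above.
    """
    # Imported here so guess_content_type stays usable without the Azure SDK
    from azure.storage.blob import ContentSettings

    container_client.upload_blob(
        name=blob_name,
        data=data,
        overwrite=True,
        content_settings=ContentSettings(
            content_type=guess_content_type(blob_name)
        ),
    )
```

With this in place, `upload_with_content_type(container_client, "index.html", html)` would take the place of the `dbutils.fs.put` call from the question.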