Tags: hadoop, hdfs, azure-storage, azure-blob-storage, azure-storage-emulator

Access Azure Storage Emulator through the Hadoop FileSystem API


I have a Scala codebase where I access Azure Blob files using the Hadoop FileSystem API (not the Azure Blob web client). My usage looks like this:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

val hadoopConfig = new Configuration()
// SAS token for the container
hadoopConfig.set(s"fs.azure.sas.${blobContainerName}.${accountName}.blob.windows.core.net",
  sasKey)
hadoopConfig.set("fs.defaultFS",
  s"wasbs://${blobContainerName}@${accountName}.blob.windows.core.net")
// Map the wasb and wasbs schemes to the hadoop-azure implementations
hadoopConfig.set("fs.wasb.impl",
  "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
hadoopConfig.set("fs.wasbs.impl",
  "org.apache.hadoop.fs.azure.NativeAzureFileSystem$Secure")

val fs = FileSystem.get(
  new java.net.URI(s"wasbs://${blobContainerName}@${accountName}.blob.windows.core.net"),
  hadoopConfig)

I am now writing unit tests for this code, using the Azure Storage Emulator as the storage account. I went through this page, but it only explains how to access the emulator through the web APIs of AzureBlobClient. I need to figure out how to test the code above by accessing the Azure Storage Emulator through the Hadoop FileSystem API. I tried the following, but it does not work:

val hadoopConfig = new Configuration()
// The value below is the emulator's well-known account key, passed here as the SAS value
hadoopConfig.set(s"fs.azure.sas.${containerName}.devstoreaccount1.blob.windows.core.net",
  "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==")
hadoopConfig.set("fs.defaultFS",
  s"wasbs://${containerName}@devstoreaccount1.blob.windows.core.net")
hadoopConfig.set("fs.wasb.impl",
  "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
hadoopConfig.set("fs.wasbs.impl",
  "org.apache.hadoop.fs.azure.NativeAzureFileSystem$Secure")
val fs = FileSystem.get(
  new java.net.URI(s"wasbs://${containerName}@devstoreaccount1.blob.windows.core.net"),
  hadoopConfig)

Solution

  • I was able to solve this problem and connect to the storage emulator by adding the following two configuration settings:

    hadoopConfig.set("fs.azure.test.emulator", "true")
    hadoopConfig.set("fs.azure.storage.emulator.account.name",
      "devstoreaccount1.blob.windows.core.net")
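
For unit tests it may help to keep the emulator settings in one place. The following is a minimal sketch, not part of the original answer: `EmulatorConfig`, `emulatorSettings`, and `emulatorUri` are hypothetical names. The property keys and values are the ones shown in the question and the answer. The sketch only builds plain Scala values, so it assumes the caller applies them to a Hadoop `Configuration`.

```scala
// Hypothetical helper: gathers the emulator-related settings from the
// question and the answer so a test can apply them in one step.
object EmulatorConfig {
  // Account name in the host-style form used in the question and the answer.
  val EmulatorAccount: String = "devstoreaccount1.blob.windows.core.net"

  // Property keys and values copied from the question and the answer.
  def emulatorSettings(account: String = EmulatorAccount): Map[String, String] =
    Map(
      "fs.azure.test.emulator" -> "true",
      "fs.azure.storage.emulator.account.name" -> account,
      "fs.wasb.impl" -> "org.apache.hadoop.fs.azure.NativeAzureFileSystem",
      "fs.wasbs.impl" -> "org.apache.hadoop.fs.azure.NativeAzureFileSystem$Secure"
    )

  // URI in the same wasbs://<container>@<account> form the question uses.
  def emulatorUri(container: String, account: String = EmulatorAccount): String =
    s"wasbs://$container@$account"
}
```

To use it, a test could call `EmulatorConfig.emulatorSettings().foreach { case (k, v) => hadoopConfig.set(k, v) }` and then `FileSystem.get(new java.net.URI(EmulatorConfig.emulatorUri(containerName)), hadoopConfig)`. This assumes hadoop-azure is on the classpath and the emulator is running locally.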