Tags: scala, azure, apache-spark, azure-data-lake-gen2

Reading file from Azure Data Lake Storage V2 with Spark 2.4


I am trying to read a simple CSV file from Azure Data Lake Storage V2 with Spark 2.4, running in IntelliJ IDEA on a Mac.

Code below:

package com.example

import org.apache.spark.SparkConf
import org.apache.spark.sql._



object Test extends App {

  val appName: String = "DataExtract"
  val master: String = "local[*]"
  val sparkConf: SparkConf = new SparkConf()
    .setAppName(appName)
    .setMaster(master)
    .set("spark.scheduler.mode", "FAIR")
    .set("spark.sql.session.timeZone", "UTC")
    .set("spark.sql.shuffle.partitions", "32")
    .set("fs.defaultFS", "abfs://development@xyz.dfs.core.windows.net/")
    .set("fs.azure.account.key.xyz.dfs.core.windows.net", "~~key~~")


  val spark: SparkSession = SparkSession
    .builder()
    .config(sparkConf)
    .getOrCreate()
  spark.time(run(spark))

  def run(spark: SparkSession): Unit = {
    val df = spark.read.csv("abfs://development@xyz.dfs.core.windows.net/development/sales.csv")
    df.show(10)
  }

}

It is not able to read the file and throws the following exception while setting up SSL:

Exception in thread "main" java.lang.NullPointerException
    at org.wildfly.openssl.CipherSuiteConverter.toJava(CipherSuiteConverter.java:284)
    at org.wildfly.openssl.OpenSSLEngine.toJavaCipherSuite(OpenSSLEngine.java:1094)
    at org.wildfly.openssl.OpenSSLEngine.getEnabledCipherSuites(OpenSSLEngine.java:729)
    at org.wildfly.openssl.OpenSSLContextSPI.getCiphers(OpenSSLContextSPI.java:333)
    at org.wildfly.openssl.OpenSSLContextSPI$1.getSupportedCipherSuites(OpenSSLContextSPI.java:365)
    at org.apache.hadoop.fs.azurebfs.utils.SSLSocketFactoryEx.<init>(SSLSocketFactoryEx.java:105)
    at org.apache.hadoop.fs.azurebfs.utils.SSLSocketFactoryEx.initializeDefaultFactory(SSLSocketFactoryEx.java:72)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.<init>(AbfsClient.java:79)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:817)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:149)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:108)

Can anyone help me find the mistake?


Solution

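  • From what I found, this error occurs when a jar on your classpath is incompatible with the Hadoop version you are running. In your stack trace the failing classes belong to wildfly-openssl (org.wildfly.openssl), which the ABFS connector calls while initializing its SSL socket factory.

    Please take a look at the following issues: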

    http://mail-archives.apache.org/mod_mbox/spark-issues/201907.mbox/%3CJIRA.13243325.1562321895000.591499.1562323440292@Atlassian.JIRA%3E

    https://issues.apache.org/jira/browse/HADOOP-16410
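
    To check for a mismatch, it can help to see which jar each of the classes in the stack trace is actually loaded from. Below is a minimal diagnostic sketch, not a fix: the object name ClasspathCheck and the helper jarOf are made up for illustration, and the class names are taken from the stack trace in the question (plus org.apache.hadoop.fs.FileSystem, which I am assuming is a reasonable stand-in for hadoop-common). Run it with the same classpath as your Spark application.

```scala
// Diagnostic sketch: report which jar each class is loaded from, so that
// mismatched hadoop-azure / wildfly-openssl / hadoop-common versions are
// easier to spot. ClasspathCheck and jarOf are hypothetical names.
object ClasspathCheck {

  /** Location of the jar (or directory) that provides `className`, if known. */
  def jarOf(className: String): Option[String] =
    try {
      // initialize = false: only locate the class, do not run its static initializers.
      val cls = Class.forName(className, false, getClass.getClassLoader)
      // getCodeSource can be null (for example for JDK bootstrap classes).
      Option(cls.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
    } catch {
      case _: ClassNotFoundException | _: LinkageError => None
    }

  def main(args: Array[String]): Unit = {
    val classes = Seq(
      "org.wildfly.openssl.OpenSSLEngine",                 // from the stack trace
      "org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem", // from the stack trace
      "org.apache.hadoop.fs.FileSystem"                    // assumed: hadoop-common
    )
    classes.foreach { name =>
      println(s"$name -> ${jarOf(name).getOrElse("<not found or no code source>")}")
    }
  }
}
```

    The jar file names in the output normally include the version numbers. If the Hadoop jars come from different Hadoop releases, or the wildfly-openssl jar is not the version your hadoop-azure release was built against, aligning those versions in your build is the first thing to try, in line with the issues linked above.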