apache-spark, hadoop, hortonworks-data-platform

Apache Spark 2.3.1 compatibility with Hadoop 3.0 in HDP 3.0


I am planning to upgrade from Hortonworks Data Platform (HDP) 2.6.x to HDP 3.0. However, there appear to be some major bugs in Apache Spark 2.3.x and its integration with Hadoop 3.0 that are still unresolved in the Apache Spark JIRA, although the Spark development team is working on them. Has the Hortonworks team provided workarounds or fixes for these issues, or do they still exist in HDP 3.0?

Some unresolved issues concerning my use case:

  1. Spark DataFrames do not work with Hadoop 3.0 https://issues.apache.org/jira/browse/SPARK-18673
  2. Kerberos Ticket renewal fails in Hadoop 3 https://issues.apache.org/jira/browse/SPARK-24493
  3. Spark run on Hadoop 3 https://issues.apache.org/jira/browse/SPARK-23534

Solution

  • I tested the integration of the HDP build of Spark 2.3.1 with Hadoop 3.0.1. It works, and the issues above are resolved in the HDP version of Spark, although the fixes are not listed in the HDP 3 release notes. Check the community answer.
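
If you want to confirm which Spark and Hadoop versions your own cluster is actually running before trusting the result above, something like the following PySpark sketch may help. It is only an illustration: the helper names (`major_minor`, `report_versions`, `smoke_test`) are made up for this example, it assumes a working `SparkSession`, and it runs a trivial DataFrame job rather than reproducing any of the JIRA issues listed in the question.

```python
def major_minor(version):
    """Return (major, minor) from a dotted version string, e.g. '3.0.1' -> (3, 0)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def report_versions(spark):
    """Return the Spark and Hadoop versions seen by a live SparkSession."""
    # VersionInfo is Hadoop's own utility class, reached here through
    # PySpark's Py4J gateway (the underscore-prefixed _jvm attribute).
    jvm = spark.sparkContext._jvm
    hadoop_version = jvm.org.apache.hadoop.util.VersionInfo.getVersion()
    return {"spark": spark.version, "hadoop": hadoop_version}


def smoke_test(spark):
    """Run a trivial DataFrame job and return its row count."""
    # Hypothetical check: it only shows that a basic DataFrame action completes.
    return spark.range(10).count()
```

In a `pyspark` shell, where `spark` already exists, `report_versions(spark)` returns both version strings and `major_minor(report_versions(spark)["hadoop"])` gives the Hadoop major and minor numbers to compare against what you expect from HDP 3.0. If `smoke_test(spark)` completes, basic DataFrame execution works on that cluster; it does not say anything about Kerberos ticket renewal in long-running jobs.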