apache-spark cloudera-cdh knime

Spark integration in KNIME


I am planning to run Spark from the KNIME Analytics Platform. For this I need to install the KNIME Spark Executor in the KNIME Analytics Platform. Can anyone please tell me how to install the KNIME Spark Executor in the KNIME Analytics Platform for the Hadoop distribution CDH 5.10.X?

I am referring to the installation guide at the link below:

https://www.knime.org/knime-spark-executor


Solution

  • I successfully integrated Spark with KNIME on CDH 5.7 by following these steps:

    1. Download knime-full_3.3.2.linux.gtk.x86_64.tar.gz.

    2. Extract the package and run the KNIME installation.

    3. After KNIME is installed, go to File -> Install KNIME Extensions and install the Big Data extensions (check all the Spark-related extensions and proceed). Follow this link: https://tech.knime.org/installation-instructions#download

    4. At this point only the Big Data extensions are installed; they need a license to be functional.

    5. The license must be purchased. However, a free 30-day trial is available, after which a license must be purchased. Follow this link: https://www.knime.org/knime-spark-executor

    6. After the plugins are installed, configure spark-job-server. For that, download the version of spark-job-server that is compatible with your Hadoop version. Follow this link for the spark-job-server versions and their compatibility: https://www.knime.org/knime-spark-executor
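    Steps 1 and 2 can also be scripted. Below is a minimal Python sketch, assuming the archive has already been downloaded. The function name `unpack_knime` is hypothetical, and the assumption that the launcher is a file named `knime` inside the extracted folder is not taken from the installation guide, so adjust it to your layout.

```python
import tarfile
from pathlib import Path
from typing import Optional


def unpack_knime(archive: str, dest: str) -> Optional[Path]:
    """Extract a KNIME .tar.gz archive into dest and return the launcher path, if found."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Unpack the gzip-compressed tarball into the destination directory.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_dir)

    # Assumption: the launcher is a regular file named "knime" somewhere
    # below the extraction directory. Prefer the shallowest match.
    candidates = [p for p in dest_dir.rglob("knime") if p.is_file()]
    candidates.sort(key=lambda p: len(p.parts))
    return candidates[0] if candidates else None
```

    For example, `unpack_knime("knime-full_3.3.2.linux.gtk.x86_64.tar.gz", "/opt/knime")` would extract the package into `/opt/knime` (a hypothetical target directory) and return the path of the launcher, or `None` if no file named `knime` was found. The extensions, license, and spark-job-server steps still have to be done as described above.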