hadoop, hive, apache-spark, shark-sql

Which is better in terms of speed, Shark or Spark?


I am very confused about these two. I know Shark is the same as Hive but 100x faster, and that it works on Spark. I want to know the main difference between Spark and Shark. Which is better, meaning faster?

When should I use Spark, and when Shark?


Solution

  • Spark is a framework for distributed data processing; you can write your code in Scala, Java, or Python. Shark has been superseded by Spark SQL, which is a SQL engine on top of Spark: you write SQL queries and they are executed using the Spark framework.

    Here's the Spark programming guide: https://spark.apache.org/docs/latest/programming-guide.html
    Here's the Spark SQL guide: https://spark.apache.org/docs/latest/sql-programming-guide.html

    So if you write a Spark SQL query, it is converted to Spark code and executed, which means that in general you can write Spark code that runs at the same speed as, or faster than, the equivalent Spark SQL query.