
Performance of Apache Ignite vs Apache Drill for SQL


I need to fetch data from some large MySQL tables and display it on a dashboard/web portal. My main focus is improving SQL performance given the size of the datasets.

Also, is Apache Ignite less scalable than Apache Drill considering Ignite uses RAM as a primary data source?

Please let me know if more detail is needed.

I have been through these links: http://drcos.boudnik.org/2015/04/apache-ignite-vs-apache-spark.html https://mpouttuclarke.wordpress.com/2016/01/04/why-i-tried-apache-spark-and-moved-on/

Does using the optional HDFS layer beneath IGFS slow the system down to the level of SparkSQL? https://ignite.apache.org/features/igfs.html


Solution

  • Drill is simply a SQL query engine, mainly for NoSQL databases. Its performance is good compared to Hive and many NoSQL databases because it processes data in memory.

    Check how Query execution works in Drill - here.

    Scalability

    Apache Drill is highly scalable, so you need not worry about that.

    You cannot compare two overlapping tools in theory alone. I suggest doing a POC (proof of concept) with some sample MySQL data on both tools. Performance depends a lot on your use case.

    Drill is best for querying complex JSON files (because of its columnar layout) and for solving polyglot use cases (performing joins across multiple datastores).
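
To make the POC suggestion concrete, here is a minimal sketch of a timing harness you could point at either engine. It only measures wall-clock latency of a callable that runs one query. The names `time_query` and `make_sqlite_runner`, the `orders` table, and the sample rows are all hypothetical, and in-memory SQLite is used purely as a stand-in so the sketch runs on its own. For a real comparison you would replace the runner with one that sends the same SQL to Drill and to Ignite through whichever client you use.

```python
import sqlite3
import statistics
import time
from typing import Callable, Dict, List


def time_query(run_query: Callable[[], object], repeats: int = 5, warmup: int = 1) -> Dict[str, float]:
    """Run a query callable several times and summarize wall-clock latency in seconds."""
    for _ in range(warmup):
        run_query()  # warm-up runs are not timed, so one-off startup cost is excluded
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def make_sqlite_runner(sql: str) -> Callable[[], object]:
    """Stand-in engine (assumption): in-memory SQLite, NOT Drill or Ignite.

    Builds a small hypothetical `orders` table and returns a callable
    that executes `sql` against it and fetches all rows.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("r%d" % (i % 10), float(i)) for i in range(10000)],
    )
    conn.commit()
    return lambda: conn.execute(sql).fetchall()


if __name__ == "__main__":
    runner = make_sqlite_runner("SELECT region, SUM(amount) FROM orders GROUP BY region")
    print(time_query(runner))
```

Run the same set of dashboard queries through one runner per engine, on the same sample data, and compare the medians rather than single runs, since the first execution often includes connection and planning overhead.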