I was wondering if you could tell me which NoSQL database or technology/tools I should use for my scenario. We are looking at replacing our OLAP cubes, which are based on SQL Server Analysis Services, with an open source technology because the data is getting too large to manage and queries are taking too long to return. We have followed every rule in the book (sharding the data, optimizing the cube design with aggregations and partitions, etc.) and some of our distinct count queries still take 1-2 minutes :( Our fact table is roughly 250 GB, and there are 10-12 dimensions connected to it in a star schema.
So we decided to give open source technologies like Hadoop/HBase/NoSQL databases a try, to see if they can solve our OLAP scenarios with minimal setup and onboarding.
Our main requirements for the new technology are:
It has to get blazing fast or instantaneous results for distinct count queries (< 2 secs).
Supports the concept of measures and dimensions (like in OLAP).
As there are so many new technologies and tools in the open source world today, I was hoping you could point me in the right direction.
Note: I'm from the Apache Kylin team.
Please see the answers below, which may give you some ideas:
Our main requirements for the new technology are: It has to get blazing fast or instantaneous results for distinct count queries (< 2 secs)
--Luke: Our current statistics show a 90th-percentile query latency of less than 5 s. For < 2 s on distinct count, how much data will you have? Is an approximate result OK?
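To make the "approximate result" question concrete: one widely used technique for approximate distinct counting is HyperLogLog. The sketch below is only an illustration of that general technique, not Kylin's implementation; the class name `HyperLogLog`, the precision parameter `p`, and the choice of SHA-1 as the hash are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch (illustrative only, not Kylin's code)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m
        # Bias-correction constant; this formula is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        # Hash the value to 64 bits.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits: register index
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64-p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # The union of two sketches is the element-wise max of their registers.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self):
        raw = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            raw = self.m * math.log(self.m / zeros)
        return int(round(raw))
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%. Because sketches can be merged, per-partition sketches can be built ahead of time and combined at query time, which is the usual reason approximate distinct counts return much faster than exact ones.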
Supports the concept of measures and dimensions (like in OLAP).
--Luke: Kylin is a pure OLAP engine with dimension (hierarchies are also supported) and measure (Sum/Count/Min/Max/Avg/DistinctCount) definitions.
Support SQL like query language as many of our developers are SQL experts.
--Luke: Kylin supports an ANSI SQL interface (most SELECT functions).
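As a rough illustration of the kind of query being discussed (a measure and a distinct count grouped by a dimension over a star schema), here is a small sketch. It uses Python's built-in SQLite purely as a stand-in so the example runs anywhere; it does not connect to Kylin, and the table and column names (`fact_sales`, `dim_country`, `user_id`, `amount`) are hypothetical.

```python
import sqlite3

# In-memory database standing in for the OLAP store (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE fact_sales (user_id INTEGER, country_id INTEGER, amount REAL);
INSERT INTO dim_country VALUES (1, 'US'), (2, 'DE');
INSERT INTO fact_sales VALUES
  (101, 1, 10.0), (101, 1, 5.0), (102, 1, 7.5),
  (201, 2, 3.0), (201, 2, 4.0);
""")

# One dimension (country_name), one additive measure (SUM of amount),
# and one distinct-count measure (COUNT DISTINCT of user_id).
query = """
SELECT d.country_name,
       SUM(f.amount)             AS total_amount,
       COUNT(DISTINCT f.user_id) AS distinct_users
FROM fact_sales f
JOIN dim_country d ON f.country_id = d.country_id
GROUP BY d.country_name
ORDER BY d.country_name
"""
rows = conn.execute(query).fetchall()
for country, total, users in rows:
    print(country, total, users)
```

The SELECT statement itself is plain ANSI-style SQL; whether a given engine accepts it unchanged depends on which functions its SQL interface covers.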
Ability to connect Excel/Tableau to visualize the data.
--Luke: Kylin has an ODBC driver that works very well with Tableau; Excel/Power BI support is coming soon.
Please let us know if you have more questions.
Thanks.