
Apache Kylin - wrong output at the first step of cube building


I'm trying to build my first cube using Apache Kylin. Everything goes fine until the last step, where I get this error:

java.lang.IllegalStateException: Can't get cube source record count.
at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
at org.apache.kylin.job.cube.UpdateCubeInfoAfterBuildStep.doWork(UpdateCubeInfoAfterBuildStep.java:104)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

According to this issue, https://github.com/KylinOLAP/Kylin/issues/101, the error above occurs because Kylin tries to find this pattern in Hive's output: "HDFS Read: (\d+) HDFS Write: (\d+) SUCCESS".
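To make the check concrete, here is a minimal sketch of how such a pattern behaves against the Hive log line quoted further down. The class and method names (HdfsCounterPattern, parseCounters) are made up for illustration and are not Kylin's actual code; only the regular expression itself comes from the linked issue.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Kylin: it only illustrates the regex from the issue.
public class HdfsCounterPattern {
    // Pattern as quoted in the linked issue.
    static final Pattern HDFS_COUNTERS =
            Pattern.compile("HDFS Read: (\\d+) HDFS Write: (\\d+) SUCCESS");

    // Returns {bytesRead, bytesWritten} if the line contains the pattern, otherwise null.
    static long[] parseCounters(String line) {
        Matcher m = HDFS_COUNTERS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        // Line taken from the Hive log in this question.
        String line = "Stage-Stage-1: Map: 1   Cumulative CPU: 17.32 sec   "
                + "HDFS Read: 44644035 HDFS Write: 2347008 SUCCESS";
        long[] counters = parseCounters(line);
        System.out.println(counters == null
                ? "no match"
                : "read=" + counters[0] + " write=" + counters[1]);
    }
}
```

The Hive log line matches, so if the step output that Kylin captured is garbled or missing this line, a lookup like this would find nothing, which would be consistent with the exception in the last step.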

Correct output, from the cube building tutorial (github.com/KylinOLAP/Kylin/wiki/Kylin-Cube-Build-and-Job-Monitoring-Tutorial):

https://i.sstatic.net/W02r2.png

My output in Kylin looks corrupted:

https://i.sstatic.net/lIZeH.png

However, when I check the Hive log, it looks OK:

2015-05-27 08:40:13,419 INFO  [main]: ql.Driver (Driver.java:execute(1285)) - Starting command: 
INSERT OVERWRITE TABLE kylin_intermediate_Kubek_19700101000000_2922789940817071255_f23ac1b1_10fe_4112_ac9e_b4e6baf07654 SELECT
FACT_TABLE.DATE
,FACT_TABLE.MONEY_ADVERTISER
,FACT_TABLE.MONEY_PUBLISHER
FROM DEFAULT.ADVSTATS as FACT_TABLE 
...
2015-05-27 08:45:05,132 INFO  [main]: ql.Driver (SessionState.java:printInfo(824)) - MapReduce Jobs Launched: 
2015-05-27 08:45:05,148 INFO  [main]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-1: Map: 1   Cumulative CPU: 17.32 sec   HDFS Read: 44644035 HDFS Write: 2347008 SUCCESS
2015-05-27 08:45:05,153 INFO  [main]: ql.Driver (SessionState.java:printInfo(824)) - Total MapReduce CPU Time Spent: 17 seconds 320 msec
2015-05-27 08:45:05,167 INFO  [main]: ql.Driver (SessionState.java:printInfo(824)) - OK

I'm using Hortonworks Sandbox 2.2:

hadoop-2.6.0
hbase-0.98.12
hive-0.14.0
zookeeper-3.4.6

Can someone tell me why my Kylin log preview looks like that and, above all, whether it could be the reason for the error in the last step?


Solution

  • A few days ago the Kylin developers committed a workaround for this kind of issue:

    https://github.com/apache/incubator-kylin/commit/a4692dba681bc2f136e02c64565639eb0080fcc9

    Because Hadoop may sometimes fail to get the counter even if the job succeeds, Kylin now gives a warning instead of an error when it fails to get the cube source record count.

    All I had to do was rebuild Kylin.