Tags: hadoop, mapreduce, hive, hortonworks-data-platform, tez

hive query BlockMissingException


I am having issues with both the Tez and MapReduce execution engines. Both appear to be related to permissions, but for the life of me, I am lost.

When I execute the query through Tez I get this message:

org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-300459168-127.0.1.1-1478287363661:blk_1073741961_1140 file=/tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/hive-hcatalog-core.jar

Looking at the file permissions in HDFS, however, they appear correct:

drwx------ - hiveuser hadoop 0 2016-11-11 09:54 /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd

drwx------ - hiveuser hadoop 0 2016-11-11 09:54 /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/.tez

-rw-r--r-- 3 hiveuser hadoop 259706 2016-11-11 09:54 /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/hive-hcatalog-core.jar

On MapReduce the message is this:

Could not obtain block: BP-300459168-127.0.1.1-1478287363661:blk_1073741825_1001 file=/hdp/apps/2.5.0.0-1245/mapreduce/mapreduce.tar.gz

File permissions on that one:

-r--r--r-- 3 hdfsuser hadoop 51232019 2016-11-04 16:40 /hdp/apps/2.5.0.0-1245/mapreduce/mapreduce.tar.gz
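
As far as I can tell, BlockMissingException means the client could not fetch the block from any DataNode, so as a sanity check it can help to run hdfs fsck on the affected files; this is just a sketch using the paths from the errors above:

# Show which DataNodes are supposed to hold the blocks of the Tez session jar
hdfs fsck /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/hive-hcatalog-core.jar -files -blocks -locations

# Same check for the MapReduce framework archive
hdfs fsck /hdp/apps/2.5.0.0-1245/mapreduce/mapreduce.tar.gz -files -blocks -locations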

Can anyone tell me what I am missing there? Please?


Solution

  • I finally figured this out and thought it would be friendly of me to post the solution. It was one of those issues where, once you finally get it, you think, "Ugh, that was so obvious." One important note: if you are having trouble with Hive, make sure to check the YARN logs too (there is a quick way to pull them in the sketch after the hosts file below)!

    My solution to this, and to so many other issues, was ensuring that every node had every other node's IP address in its hosts file. This ensures Ambari resolves each hostname to the correct IP. I am on Ubuntu, so I did the following:

    $ vim /etc/hosts

    And then the file came out looking like this:

    127.0.0.1       localhost
    #127.0.1.1      ambarihost.com ambarihost
    # Assigning static IP here so ambari gets it right
    192.168.0.20    ambarihost.com ambarihost
    
    #Other hadoop nodes
    192.168.0.21    kafkahost.com kafkahost
    192.168.0.22    hdfshost.com hdfshost
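
    After updating /etc/hosts on every node (and restarting the affected services, e.g. through Ambari), a few commands are handy for confirming the fix took effect. The hostname below is the one from my hosts file, so adjust it for your own cluster:

    # Name resolution should now return the static IP on every node
    getent hosts ambarihost.com

    # All DataNodes should be registered, and no blocks should be reported missing
    hdfs dfsadmin -report
    hdfs fsck /

    # And, per the note above, the YARN logs for a failed Hive query can be pulled with
    # (replace <application_id> with the id shown in the ResourceManager UI or the Hive console)
    yarn logs -applicationId <application_id>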