
Hive External table not showing anything


I am trying to learn Hive by following the Twitter data tutorial at the link below: https://github.com/cloudera/cdh-twitter-example/

I have successfully installed and configured Hadoop and Hive, and tested loading a simple text file into a Hive table. All working well so far.

However, even though the files exist in HDFS, the external table shows nothing.

I used the code below to create the table.

CREATE EXTERNAL TABLE tweets (
... 
 Columns ....
...
)
PARTITIONED BY (datehour INT)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/flume/tweets';

I think the problem is the folder structure in my HDFS. It currently follows year/month/day/hour/, like below.

/user
  -- /flume
      -- /tweets
          -- /2015         
              -- 04        
                -- 01      
                 -- 13     
                 -- 14
                -- 02
                 -- 15
                 -- 16

Is there a way to set the partition correctly for this folder structure when creating an external table in Hive?

Thanks in advance for your help...


Solution

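  • You have to add the partition to the table. Hive does not discover partition directories on its own; each one must be registered in the metastore together with its location.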

      ADD JAR your-serde-jar-file-path.jar;
    
    
      ALTER TABLE tweets ADD IF NOT EXISTS PARTITION (datehour = 2015040113) LOCATION '/user/flume/tweets/2015/04/01/13';
    

    - If you schedule this with Oozie, you have to pass DATEHOUR and PARTITION_PATH from the Oozie coordinator file.

    ADD JAR ${JSON_SERDE};
    
    ALTER TABLE tweets ADD IF NOT EXISTS PARTITION (datehour = ${DATEHOUR}) LOCATION '${PARTITION_PATH}';
    

    For details, see http://blog.cloudera.com/blog/2013/01/how-to-schedule-recurring-hadoop-jobs-with-apache-oozie/
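
    For the directories that already exist, the manual ALTER TABLE step has to be repeated once per hour folder. As a rough sketch (not part of the tutorial), the small Python helper below turns each year/month/day/hour path from the question into the matching statement. The function name add_partition_statement and the regular expression are made up for illustration, and the table name tweets is assumed from the statements above; you would paste the printed output into the Hive shell after the ADD JAR line.

```python
import re

# Assumed layout from the question: <base>/<yyyy>/<mm>/<dd>/<hh>
HOUR_DIR = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/(\d{2})/?$")


def add_partition_statement(path, table="tweets"):
    """Build the ALTER TABLE statement for one hourly directory.

    Hypothetical helper: it only formats text and does not talk to Hive.
    """
    match = HOUR_DIR.search(path)
    if match is None:
        raise ValueError(f"not an hourly directory: {path}")
    # Concatenate year, month, day, hour into the INT partition value,
    # e.g. 2015/04/01/13 becomes 2015040113.
    datehour = "".join(match.groups())
    location = path.rstrip("/")
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (datehour = {datehour}) LOCATION '{location}';"
    )


if __name__ == "__main__":
    # The four hourly directories shown in the question.
    paths = [
        "/user/flume/tweets/2015/04/01/13",
        "/user/flume/tweets/2015/04/01/14",
        "/user/flume/tweets/2015/04/02/15",
        "/user/flume/tweets/2015/04/02/16",
    ]
    for p in paths:
        print(add_partition_statement(p))
```

    Once the partitions are added, SHOW PARTITIONS tweets; should list them, and a SELECT on the table should return rows.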