hive hdfs external partition

Creating a partitioned external table with Hive: no data available


I have the following file on HDFS: [screenshot not available]

I create the structure of the external table in Hive:

CREATE EXTERNAL TABLE google_analytics(
  `session` INT)
PARTITIONED BY (date_string string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/flumania/google_analytics';

ALTER TABLE google_analytics ADD PARTITION (date_string = '2016-09-06') LOCATION '/flumania/google_analytics';

After that, the table structure is created in Hive, but I cannot see any data: [screenshot not available]

Since it's an external table, the data should be picked up automatically, right?


Solution

  • I think the problem was with the ALTER TABLE command, which set the partition's LOCATION to the table's root folder. The code below solved my problem:

    CREATE EXTERNAL TABLE google_analytics(
      `session` INT)
    PARTITIONED BY (date_string string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LOCATION '/flumania/google_analytics/';
    
    ALTER TABLE google_analytics ADD PARTITION (date_string = '2016-09-06');
    

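    When ADD PARTITION is given no LOCATION, Hive maps the partition to the date_string=2016-09-06 subfolder under the table location. So after these two steps, if that subfolder contains a CSV file matching the table structure, no separate load is needed: Hive reads the file at query time, and SELECT queries return the data right away.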

    Solved!
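
    To check the result, here is a short sketch that assumes the table and partition names used above. SHOW PARTITIONS lists the partitions Hive has registered, and DESCRIBE FORMATTED shows the location Hive resolved for one partition, which should end in date_string=2016-09-06. If many date_string=... subfolders already exist, MSCK REPAIR TABLE may be a convenient alternative to one ALTER TABLE per date; it relies on the folders following the key=value naming convention.

    -- List the partitions Hive knows about for this table.
    SHOW PARTITIONS google_analytics;

    -- Show the partition's metadata, including its Location.
    DESCRIBE FORMATTED google_analytics PARTITION (date_string = '2016-09-06');

    -- Alternative to ADD PARTITION: register every date_string=... subfolder
    -- found under the table location.
    MSCK REPAIR TABLE google_analytics;

    -- Read back a few rows from the partition.
    SELECT * FROM google_analytics WHERE date_string = '2016-09-06' LIMIT 10;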