
What is the relationship between HBase and HCatalog?


Can anyone explain the relationship between HCatalog and HBase, please?

I've found these definitions:

Apache HCatalog: HCatalog is a metadata abstraction layer for referencing data without using the underlying filenames or formats. It insulates users and scripts from how and where the data is physically stored.

Apache HBase: HBase (Hadoop DataBase) is a distributed, column-oriented database. HBase uses HDFS for the underlying storage. It supports both batch-style computations using MapReduce and point queries (random reads).

When we use CREATE TABLE in Hive, it creates a table in HCatalog. I just don't get it. Why not in a real database, which is HBase?

HCatalog seems to be some kind of metadata repository for all data stores. Does that mean it also keeps information about databases and tables in HBase?

I'd be grateful for an explanation.

Regards, Pawel


Solution

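  • When you CREATE TABLE in Hive, the table is registered in HCatalog. HCatalog holds only the table's metadata (its schema and where the data lives), not the data itself. A Hive table may be backed by an HBase table, but it can also be an abstraction over HDFS files and directories.

    You can find a nice explanation of HCatalog on Hortonworks' site.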
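
To make the distinction concrete, here is a minimal HiveQL sketch of the two cases described above. The table names, column names, the HBase table `users`, and its column family `info` are all hypothetical, and the second statement assumes that HBase table already exists and that the Hive HBase storage handler is available on your cluster.

```sql
-- Case 1: a Hive table that is an abstraction over files in HDFS.
-- Hive registers the schema in the metastore (exposed through HCatalog);
-- the rows live as tab-delimited text files under the table's directory.
CREATE TABLE page_views (
  user_id STRING,
  url     STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Case 2: a Hive table backed by an existing HBase table.
-- Only the mapping is registered in the metastore; the data stays in HBase.
-- 'users' and the 'info' column family are assumed names for illustration.
CREATE EXTERNAL TABLE hbase_users (
  row_key STRING,
  name    STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
TBLPROPERTIES ("hbase.table.name" = "users");
```

In both cases the table definition ends up in the same metadata catalog, which is why other tools that read through HCatalog can see either table without knowing how it is stored.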