hadoop, hive, hive-metastore

Which metastore does Hive use?


Before I start, I should say that I am new to Big Data and Apache Hive. When following this post to install Hive, I saw that it creates another database called metastore. After installing and configuring Hive successfully, I was able to run some basic commands (create database, create table, load data, etc.) in the Hive shell (by typing ./hive in Hive's bin folder).
When I check MySQL, two new databases have appeared: hive and metastore. I believe the metastore database is where the metadata of tables and databases is stored. However, its contents are not what I expected. Here is the result.
My question is whether I have installed Hive successfully, and what the purpose of those two databases is.
P.S. I have searched for Hive metastore and found that there are 3 different modes, but I am not sure which one applies to my situation.


Solution

  • By default, the Hive metastore uses an embedded Apache Derby database.

    It's recommended that you change this to Postgres, MySQL, or Oracle.

    The MySQL command CREATE DATABASE metastore; has no special meaning in itself. hive-site.xml sets the JDBC connection string that points to the metastore database, and you can give that database any name you want.

    The same goes for hive.

    It's unclear what you expected to find in the tables of those databases, but it's metadata, not the actual HDFS records.
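
    To check which database your install is actually pointed at, look at the javax.jdo.option.ConnectionURL property in hive-site.xml. Here is a minimal sketch in Python that pulls the database name out of that property. The XML fragment, host, port, and database name are placeholders I made up for illustration, and metastore_database is a hypothetical helper, not part of Hive; substitute the contents of your own hive-site.xml.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical hive-site.xml fragment: host, port, and database name are
# placeholders, not values taken from the question.
HIVE_SITE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
</configuration>"""


def metastore_database(hive_site_xml: str) -> str:
    """Return the database name found in the metastore JDBC connection URL."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "javax.jdo.option.ConnectionURL":
            url = prop.findtext("value").strip()
            # Drop the leading "jdbc:" so urlparse sees mysql://host:port/db?...
            parsed = urlparse(url[len("jdbc:"):])
            return parsed.path.lstrip("/")
    raise KeyError("javax.jdo.option.ConnectionURL is not set")


if __name__ == "__main__":
    print(metastore_database(HIVE_SITE))
```

    Whatever name appears after the port in that URL is the database Hive uses for its metadata. If the property is missing entirely, you are most likely still on the default Derby setup.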