
EMR cluster with external MySQL as Hive metastore


I am trying to set up an EMR cluster with an external MySQL database as the Hive metastore. I created a MySQL database named "metastore" on an EC2 instance and used the following hive-site.xml:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://10.10.xxx.xxx:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>xxxxxx</value>
    <description>Password to use against metastore database</description>
  </property>
</configuration>

Cluster creation fails with the following error (from the step's stderr log):

org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
* schemaTool failed
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
schemaTool failed *
/mnt/var/lib/hadoop/steps/s-xxxxxxxxx/./hive-script:617: Error executing cmd: /usr/share/aws/emr/scripts/hive-script "--install-hive" "--base-path" "s3://us-west-2.elasticmapreduce/libs/hive" "--hive-versions
Command exiting with ret '1'

Please help.


Solution

  • The cause was an AWS security group issue: the cluster could not reach the MySQL server. Allowing access to the MySQL port (3306 in the JDBC URL above) in the security group resolved it.
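
Before relaunching the cluster, it may help to confirm that the MySQL port is actually reachable from a machine in the same subnet and security group as the EMR nodes. The following is a minimal sketch, not part of the original answer: the helper name can_connect is hypothetical, and it only tests that a TCP connection can be opened. It does not verify MySQL credentials or the metastore schema.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    A False result is consistent with a security group (or firewall) blocking
    the port, but it can also mean the server is down or the address is wrong.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (assumption: substitute the host from your own ConnectionURL):
#   can_connect("<metastore-host>", 3306)
```

If this returns False from the EMR side but True from the MySQL host itself, the security group's inbound rules are a likely place to look.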