
Hive connection problem with local MinIO S3


I have configured my local S3 server with MinIO. I can access the files stored in it from Spark by following these steps. But if I try to configure Hive to access an external Parquet file stored on this server, I get the following error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified by setting the fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey properties (respectively).)

My Hive version is 1.1.

I'm using CDH 5.16.1 with Hadoop 2.6.

My Spark version is 1.6.

I have tried to modify the files (hive-site.xml and core-site.xml) with the properties specified here, but I get the same error. I have also tried to set these properties at execution time by typing the following commands in a Hive shell:

SET fs.s3a.endpoint=http://127.0.0.1:9003;
SET fs.s3a.access.key=ACCESSKEY;
SET fs.s3a.awsAccessKeyId=ACCESSKEY;
SET fs.s3a.secret.key=SECRETKEY;
SET fs.s3a.awsSecretAccessKey=SECRETKEY;
SET fs.s3a.path.style.access=true;
SET fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem;

Notice that I only need fs.s3a.access.key and fs.s3a.secret.key because I'm not using AWS S3 (I'm using a local S3), but I have also added the AWS key properties to my config files because of the exception message I'm getting. I have also tried s3n instead of s3a (to check whether s3a is incompatible with my Hive version), but I get the same exception message.
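For reference, the same S3A settings can also be placed in core-site.xml so they apply to every session rather than being SET per shell; a minimal sketch, using the endpoint and placeholder keys from the question (fs.s3a.awsAccessKeyId and fs.s3a.awsSecretAccessKey are not real S3A property names and can be dropped):

```xml
<!-- core-site.xml: S3A connector settings for a local MinIO endpoint.
     Endpoint, port, and key values are the ones from the question. -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>http://127.0.0.1:9003</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>ACCESSKEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>SECRETKEY</value>
</property>
<property>
  <!-- MinIO buckets are addressed by path, not by virtual-host name -->
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
```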

The CREATE TABLE command that throws the exception:

  CREATE EXTERNAL TABLE aml.bgp_pers_juridi3(
    internal_id string,
    society_type string)
  ROW FORMAT SERDE
    'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
  STORED AS INPUTFORMAT
    'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
  OUTPUTFORMAT
    'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
  LOCATION
    's3n://oclrh65c034.isbcloud.isban.corp:9003/minio/entities/bgp_pers_juridi2'
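Worth noting: the LOCATION uses an s3n:// URI, while the properties set in the shell configure the s3a connector, which would explain why Hadoop is asking for fs.s3n.awsAccessKeyId/fs.s3n.awsSecretAccessKey. A sketch of the same DDL with an s3a:// location, assuming the bucket is named entities (with fs.s3a.endpoint set, the URI names the bucket rather than the MinIO host):

```sql
-- Sketch: same table with an s3a:// LOCATION so the fs.s3a.* settings apply.
-- The bucket name "entities" is an assumption about the MinIO layout.
CREATE EXTERNAL TABLE aml.bgp_pers_juridi3(
  internal_id string,
  society_type string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3a://entities/bgp_pers_juridi2';
```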

Thanks in advance.


Solution

  • Finally I managed to get access to Cloudera Manager (the server was down and I didn't have permissions) and I restarted all the services from it. You can also modify the configuration files using Cloudera Manager; if you edit them by hand instead (as in my case), it will warn you that your configuration isn't updated in all the files it should be, and it offers to update all of those files for you automatically. I strongly recommend using Cloudera Manager to modify configuration properties for the different services, because it propagates the properties to all the related files and then helps you restart the affected services.