Tags: hive, hdfs, google-cloud-storage, remote-access, hive-partitions

Can Hive load data from an external location that is not on HDFS?


I am trying to understand whether an external table in Hive can have its location outside of HDFS. Specifically, I want to create an external table on top of a Google Cloud Storage location (gs://bucket-name/table-partitions).


Solution

  • This is not a difficult problem, but it requires some provisioning beyond the default setup, which isn't particularly well documented for Google Cloud. To make it work, update the following Hadoop configuration parameters:

    A. Set google.cloud.auth.service.account.email to the email address of your Google service account (for example, an address in a domain such as @test.gservice.com).

    B. Set google.cloud.auth.service.account.keyfile to the location of the service account's .p12 key file.

    C. Set google.cloud.auth.service.account.enable to true.
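
As a rough illustration, the three parameters above might look like this in a Hadoop configuration file such as core-site.xml. This is only a sketch: the email address and key file path are hypothetical placeholders, and it assumes the Cloud Storage connector for Hadoop is already installed and on the classpath of the cluster.

```xml
<!-- Sketch of a core-site.xml fragment; all values are hypothetical placeholders. -->
<configuration>
  <!-- A. Email address of the Google service account (placeholder value) -->
  <property>
    <name>google.cloud.auth.service.account.email</name>
    <value>my-service-account@test.gservice.com</value>
  </property>
  <!-- B. Location of the service account's .p12 key file (placeholder path) -->
  <property>
    <name>google.cloud.auth.service.account.keyfile</name>
    <value>/path/to/keyfile.p12</value>
  </property>
  <!-- C. Enable service account authentication -->
  <property>
    <name>google.cloud.auth.service.account.enable</name>
    <value>true</value>
  </property>
</configuration>
```

With settings along these lines in place, the external table's LOCATION should be able to point at a path such as gs://bucket-name/table-partitions instead of an HDFS path.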