How to store the result of remote hive query to a file


I'm trying to run a Hive query on Google Compute Engine. My Hadoop service is on Google Dataproc. I submit the Hive job using these commands:

gcloud dataproc jobs submit hive --region=my-region --cluster=my-cluster-name -f file.hql > result.txt

and

gcloud dataproc jobs submit hive --region=my-region --cluster=my-cluster-name -e="use test;select * from emp;" > result.txt

I expect to see the result of the query in result.txt, but this is all I get in the file:

done: true
driverControlFilesUri: gs://my-gcs-bucket-for-dataproc/google-cloud-dataproc-metainfo/27f9-f4a5-4df2-a311-e41a92/jobs/ea7ab2164/
driverOutputResourceUri: gs://my-gcs-bucket-for-dataproc/google-cloud-dataproc-metainfo/1f309-f4a5-4df2-a311-e4182/jobs/eafab0e2164/driveroutput
hiveJob:
  queryFileUri: gs://my-gcs-bucket-for-dataproc/google-cloud-dataproc-metainfo/1ff9-f4a5-4df2-a311-e412/jobs/ea781f64/staging/file.hql
jobUuid: 91db33-ee0e-391b-b46d-37b276
placement:
  clusterName: my-cluster-name
  clusterUuid: my-cluster-uuid
reference:
  jobId: ea7ab0e2164
  projectId: my-project
status:
  state: DONE
  stateStartTime: '2022-02-07T09:33:44.317237Z'
statusHistory:
- state: PENDING
  stateStartTime: '2022-02-07T09:33:16.724561Z'
- state: SETUP_DONE
  stateStartTime: '2022-02-07T09:33:16.762680Z'
- details: Agent reported job success
  state: RUNNING
  stateStartTime: '2022-02-07T09:33:18.403518Z'
yarnApplications:
- name: HIVE-94a5b7-8bc7-4dc9-a016-81ab721
  progress: 1.0
  state: RUNNING
  trackingUrl: http://my-cluster-name:8088/proxy/application_1692_0008/

Any help would be appreciated. Thanks.


Solution

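  • The query result is written to stderr; stdout only receives the job description shown in the question. Use &> result.txt to redirect both stdout and stderr (bash shorthand for > result.txt 2>&1), or 2> result.txt to redirect stderr only.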
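
The redirections can be tried without a cluster. The sketch below uses a hypothetical stand-in function, fake_submit, that mimics the behaviour described here: job metadata on stdout, query output on stderr. The function, its sample output, and the file names job.yaml and all.txt are assumptions for illustration only. The real command from the question appears in the trailing comment.

```shell
# Stand-in for `gcloud dataproc jobs submit hive ...` (hypothetical helper, for
# illustration only): job metadata goes to stdout, query output goes to stderr.
fake_submit() {
  echo "done: true"
  echo "1,alice" >&2
}

outdir=$(mktemp -d)

# Keep the two streams apart: metadata in one file, query result in another.
fake_submit > "$outdir/job.yaml" 2> "$outdir/result.txt"

# Or capture everything in one file; `> file 2>&1` is the portable form of `&> file`.
fake_submit > "$outdir/all.txt" 2>&1

# With the real command from the question, the same redirection would be:
# gcloud dataproc jobs submit hive --region=my-region --cluster=my-cluster-name \
#   -f file.hql > job.yaml 2> result.txt
```

With the real command, stderr may also carry the driver's progress and log lines, so result.txt may need some filtering before it contains only the query rows.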