Search code examples
How to extract the query result from a Hive job output logs using DataprocHiveOperator?...


python pandas hive airflow google-cloud-dataproc

Can you trigger Python Scripts from Dataproc?...


python hadoop google-cloud-platform google-cloud-dataproc google-cloud-storage

Bigquery as metastore for Dataproc...


apache-spark google-bigquery google-cloud-dataproc spark-bigquery-connector

ClassNotFoundException: Failed to find data source: bigquery...


java maven apache-spark google-bigquery google-cloud-dataproc

Dataproc Serverless - how to set javax.net.ssl.trustStore property to fix java.security.cert.CertPat...


google-cloud-platform ssl-certificate google-cloud-dataproc google-cloud-dataproc-serverless

GCP Dataproc custom image Python environment...


python google-cloud-platform pyspark google-cloud-dataproc

Serverless Dataproc Error- Batch ID is required...


google-cloud-platform google-cloud-dataproc google-cloud-dataproc-serverless

Is there a way to import and run functions from saved .py files in a Jupyter notebook running on a G...


python google-cloud-platform jupyter-notebook cluster-computing google-cloud-dataproc

Custom Container Image for Google Dataproc pyspark Batch Job...


pyspark google-cloud-dataproc google-cloud-dataproc-serverless

Component Gateway with DataprocOperator on Airflow...


python google-cloud-platform airflow google-cloud-dataproc

StructuredStreaming withWatermark - TypeError: 'module' object is not callable...


apache-spark google-cloud-platform spark-structured-streaming google-cloud-dataproc windowing

Why is my hdfs capacity not remaining constant?...


apache-spark hadoop apache-spark-sql google-cloud-dataproc dataproc

GCP Dataproc - Failed to construct kafka consumer, Failed to load SSL keystore dataproc.jks of type ...


ssl google-cloud-platform apache-kafka spark-structured-streaming google-cloud-dataproc

GCP dataproc - java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArraySerial...


apache-spark google-cloud-platform pyspark apache-kafka google-cloud-dataproc

How to store the result of remote hive query to a file...


hive hiveql google-cloud-dataproc

GCP Dataproc - cluster creation failing when using connectors.sh in initialization-actions...


shell apache-spark google-cloud-platform google-cloud-dataproc

(gcloud.dataproc.batches.submit.spark) unrecognized arguments: --subnetwork=...


apache-spark gcloud google-cloud-dataproc

Dataproc Cluster creation is failing with PIP error "Could not build wheels"...


python-3.x pip airflow google-cloud-dataproc python-wheel

Why is there just 1 job id in dataproc when there are multiple actions in the pyspark script?...


apache-spark pyspark google-cloud-dataproc dataproc

PySpark runs in YARN client mode but fails in cluster mode for "User did not initialize spark c...


apache-spark pyspark hadoop-yarn google-cloud-dataproc dataproc

Where to find spark log in dataproc when running job on cluster mode...


pyspark google-cloud-dataproc dataproc

Dataproc Java client throws NoSuchMethodError setUseJwtAccessWithScope...


java google-cloud-platform google-cloud-dataproc

Need information on dataproc image version 1.5.54...


google-cloud-dataproc

How are spark jobs submitted in cluster mode?...


apache-spark pyspark google-cloud-dataproc spark-submit dataproc

Can run code in pyspark shell but the same code fails when submitted with spark-submit...


apache-spark pyspark hadoop-yarn google-cloud-dataproc spark-submit

Facing Issue with DataprocCreateClusterOperator (Airflow 2.0)...


python airflow google-cloud-dataproc google-cloud-composer airflow-2.x

Run a Data Fusion pipeline only when a file exist...


google-cloud-platform etl google-cloud-dataproc google-cloud-data-fusion

Dataproc: What is the primary use case of local Hive metastore?...


google-cloud-dataproc

Why is adding the org.apache.spark.avro dependency mandatory to read/write avro files in Spark2.4 whi...


scala apache-spark google-cloud-dataproc spark-avro

NoClassDefFoundError com/microsoft/aad/adal4j/AuthenticationException while connecting to Azure SQL ...


java apache-spark google-cloud-platform azure-sql-database google-cloud-dataproc
