Search code examples
How do I access the fields within a VARIANT column while reading from Kafka using Spark?...

apache-spark pyspark apache-kafka databricks variant-format

Databricks Community Edition Cluster won't start...

apache-spark databricks

How can I turn off rounding in Spark?...

python dataframe apache-spark pyspark rounding

Maven build with jenkins for scala spark program : "No primary artifact to install, installing ...

scala maven apache-spark jenkins jenkins-pipeline

Whether repartition() will always shuffle even before an action is triggered...

apache-spark pyspark apache-spark-sql

How to save RDD data into json files, not folders...

scala apache-spark spark-streaming

Create dockerfile to use airflow and spark, pip backtracking runtime issue comes out...

python docker apache-spark airflow java-11

Vertica data into pySpark throws "Failed to find data source"...

python-3.x maven apache-spark pyspark vertica

Using Java SparkSQL getting:java.lang.NoSuchMethodError: 'scala.collection.mutable.ArrayBuffer o...

java maven apache-spark

Monotonically increasing id order...

python dataframe apache-spark pyspark apache-spark-sql

checksum error while writing data to delta table. Is there a way to fix this issue?...

apache-spark pyspark delta-lake

kafka offset in spark...

scala apache-spark apache-kafka

Spark Large single Parquet file to Delta Failure with Spark SQL...

apache-spark apache-spark-sql parquet azure-synapse

spark cassandra connector problem using catalogs...

apache-spark cassandra spark-cassandra-connector

How to run twitter popular tags of Spark streaming using scala?...

scala twitter streaming apache-spark

Spark on Docker Fails to Connect to AWS RDS PostgreSQL via Bastion...

postgresql amazon-web-services docker apache-spark docker-compose

Spark SQL Row_number() PartitionBy Sort Desc...

python apache-spark pyspark apache-spark-sql window-functions

Spark transactional write operation using temporary directories...

apache-spark amazon-s3 hdfs

Pyspark Jupyter - dataframe created in java code vs python code...

apache-spark pyspark jupyter-notebook py4j

FileNotFoundException when trying to save DataFrame to parquet format, with 'overwrite' mode...

apache-spark pyspark apache-spark-sql

How can I account for AM/PM in string to DateTime conversion in pyspark?...

apache-spark datetime pyspark apache-spark-sql

Spark reads more documents than Mongo collection actually returns...

mongodb apache-spark apache-spark-sql

Spark 3.1.2: Kubernetes Client Closed Warning Leading to Executor Task Hanging – How to Fix or Work ...

apache-spark kubernetes

What is the use of --driver-class-path in the spark command?...

apache-spark

Does Spark preserve record order when reading in ordered files?...

apache-spark

How to get week of month in Spark 3.0+?...

apache-spark datetime pyspark apache-spark-sql apache-spark-3.0

What is a glom?. How it is different from mapPartitions?...

apache-spark rdd

Pyspark java UDF java.lang.OutOfMemoryError: Requested array size exceeds VM limit. SQLSTATE: 39000...

apache-spark pyspark

pyspark foreachPartition not getting executed...

python apache-spark pyspark

Spark Excel library unable to read whole columns, only specific data address ranges...

java apache-spark apache-poi spark-excel
