apache-spark pyspark spark-structured-streaming

Handling empty batches from an incoming Kinesis stream in Spark Structured Streaming


We are reading data from Kinesis and writing it out to files using Spark Structured Streaming. The Kinesis implementation generates empty batches when there is no data in the stream, and these empty batches produce blank output files. Any idea how we can stop Spark from writing out blank files?


Solution

  • Partitioning the output fixed the empty-file problem.
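
The answer does not say how the output was partitioned. Here is a minimal sketch of one way to do it with `partitionBy` on the streaming writer. The helper name `start_partitioned_file_sink`, the paths, the `parquet` format, and the `event_date` column are all assumptions for illustration, not taken from the original post. The likely reason this helps is that a partitioned file sink only creates files for partition values that actually appear in a batch, so an empty batch has nothing to write.

```python
def start_partitioned_file_sink(stream_df, output_path, checkpoint_path, partition_col):
    """Start a file sink that writes one sub-directory per value of partition_col.

    stream_df is assumed to be a streaming DataFrame that has already been
    read from Kinesis with whatever connector the job uses.
    """
    return (
        stream_df.writeStream
        .format("parquet")                       # assumed output format
        .partitionBy(partition_col)              # e.g. a hypothetical "event_date" column
        .option("path", output_path)
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")                    # file sinks support append mode
        .start()
    )
```

It would be called roughly as `query = start_partitioned_file_sink(kinesis_df, "/data/out", "/data/checkpoints", "event_date")`, where `kinesis_df` and both paths are placeholders. If the data has no natural partition column, one could be derived first (for example a date taken from an event timestamp).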