I have a RDD and want to add more RDD to it. How can I do it in Spark? I have code like below. I want to return RDD from the dStream I have.
JavaDStream<Object> newDStream = dStream.map(this);
JavaRDD<Object> rdd = context.sparkContext().emptyRDD();
return newDStream.wrapRDD(context.sparkContext().emptyRDD());
I do not find much documentation about wrapRDD method of JavaDStream class provided by Apache Spark.
You can use JavaStreamingContext.queueStream
and fill it with a Queue<RDD<YourType>>
:
public JavaInputDStream<Object> FillDStream() {
LinkedList<RDD<Object>> rdds = new LinkedList<RDD<Object>>();
rdds.add(context.sparkContext.emptyRDD());
rdds.add(context.sparkContext.emptyRDD());
JavaInputDStream<Object> filledDStream = context.queueStream(rdds);
return filledStream;
}