Tags: spring-data, spring-cloud, spring-cloud-stream, spring-cloud-dataflow

Spring cloud data flow internal communication


I'm trying to use Spring Cloud Data Flow for a data pipeline in which the source reads data from a REST stream and the sink saves it to a database. I have already built a POC using Spring WebFlux that reads the streaming data and saves it to the database, and it works fine. Now I want to do the same with Spring Cloud Data Flow, and I'm trying to understand how exactly the Source / Processor / Sink communicate with each other.

In my scenario it is a batch / short-lived application that runs periodically and consumes data from the REST endpoint. I looked at the documentation to understand how the source and sink communicate, but I couldn't find anything. So how do the source and sink transfer data in the case of a short-lived application? My understanding is that each of them runs in a separate JVM, so they need a way to communicate and transfer data.

1) Is my understanding correct?
2) Is it via messaging?


Solution

  • Your question is really a more general one: why do we use a messaging system for streaming applications? It is worth searching for that topic separately.

    Spring Cloud Data Flow leverages Spring Cloud Stream to run streaming applications. Spring Cloud Stream provides binder implementations that bind your event-driven/streaming applications to the messaging system (RabbitMQ, Apache Kafka, etc.). In a Data Flow stream, each application runs as its own process, and the applications exchange data through that messaging system.

    That said, it depends on what your application needs. In particular, if your streaming applications are not distributed and you don't need loose coupling between the producer and the consumer, you can build your application without a messaging system.
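To make the decoupling concrete, here is a minimal sketch in plain Java. It is not Spring Cloud Stream code. It only mimics the shape of the idea: the source is a `Supplier`, the sink is a `Consumer`, and neither knows about the other because they only touch a destination in between. The names `PipelineSketch`, `InMemoryDestination`, `runSource`, and `runSink` are hypothetical, and the in-memory queue is an assumed stand-in for what a binder would connect to RabbitMQ or Kafka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class PipelineSketch {

    // Hypothetical stand-in for a broker destination (e.g. a RabbitMQ exchange or Kafka topic).
    public static final class InMemoryDestination {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        public void publish(String payload) {
            queue.add(payload);
        }

        // Returns null when no message is waiting.
        public String poll() {
            return queue.poll();
        }
    }

    // Source side: pulls `count` items from the supplier and publishes them.
    public static void runSource(Supplier<String> source, InMemoryDestination destination, int count) {
        for (int i = 0; i < count; i++) {
            destination.publish(source.get());
        }
    }

    // Sink side: drains the destination and hands each payload to the consumer.
    public static int runSink(InMemoryDestination destination, Consumer<String> sink) {
        int consumed = 0;
        String payload;
        while ((payload = destination.poll()) != null) {
            sink.accept(payload);
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        InMemoryDestination destination = new InMemoryDestination();
        List<String> database = new ArrayList<>(); // stands in for the real database
        int[] counter = {0};

        Supplier<String> restSource = () -> "record-" + (++counter[0]); // stands in for the REST stream
        Consumer<String> dbSink = database::add;

        runSource(restSource, destination, 3);
        int consumed = runSink(destination, dbSink);
        System.out.println(consumed + " records saved: " + database);
        // prints: 3 records saved: [record-1, record-2, record-3]
    }
}
```

In a single JVM, as in the last case described above, the destination can be dropped entirely and the sink called directly with `dbSink.accept(restSource.get())`. The queue in the middle only earns its keep once the two sides run as separate processes.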