
How can I differentiate the data coming from multiple HTTP Client origins in StreamSets?


I have 6 pipelines, each with an HTTP Client origin connected to an SDC RPC destination. My plan is to build another pipeline with an SDC RPC origin that writes to Hive tables.

My question is: once the records arrive at the SDC RPC origin, how can I tell them apart? Each HTTP pipeline pulls the data for one particular table.

Any examples or online resources will be appreciated.


Solution

  • Add an Expression Evaluator to each of the 6 HTTP -> SDC RPC pipelines, and have each one set the same record header attribute to a different value (for example, the name of the table that pipeline feeds). The SDC RPC origin pipeline can then read that header attribute on every record to work out which pipeline it came from and route it to the matching Hive table.
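
To make the idea concrete, here is a small Python sketch that only models the flow; it is not StreamSets code. The `Record` class, the `tag` and `route` helpers, the attribute name `sourcePipeline`, and the table names `orders` and `customers` are all made up for illustration. `tag` stands in for the Expression Evaluator in each upstream pipeline, and `route` stands in for whatever the SDC RPC origin pipeline does with the attribute.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Toy stand-in for a record: field data plus header attributes."""
    value: dict
    header: dict = field(default_factory=dict)


def tag(record: Record, pipeline_name: str) -> Record:
    # Models the Expression Evaluator in one upstream pipeline:
    # every pipeline writes the same attribute with its own value.
    # "sourcePipeline" is an assumed attribute name.
    record.header["sourcePipeline"] = pipeline_name
    return record


def route(records: list) -> dict:
    # Models the downstream pipeline: group records by the attribute
    # so each group can be written to its own table.
    by_table = {}
    for rec in records:
        table = rec.header.get("sourcePipeline", "unknown")
        by_table.setdefault(table, []).append(rec.value)
    return by_table


# Two of the six upstream pipelines, each tagging its own records.
orders = [tag(Record({"id": 1}), "orders")]
customers = [tag(Record({"id": 7}), "customers")]

# The downstream pipeline sees one mixed stream and splits it again.
routed = route(orders + customers)
```

In the actual pipeline, the attribute should be readable with the expression language function `record:attribute`, e.g. `${record:attribute('sourcePipeline')}`, which you could use in a Stream Selector condition or wherever the table name is configured. Treat the exact placement as something to check against your StreamSets version.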