I am designing a text mining pipeline in UIMA DUCC as follows:
|-----------------|
|                 | ==CAS_1==> Pipeline A ==> Consumer A
| CAS Multiplier  | ==CAS_2==> Pipeline B ==> Consumer B
|                 | ==CAS_3==> Pipeline C ==> Consumer C
|-----------------|
I intend to run Pipelines A, B, and C in parallel. I believe this can be done using a flow controller. Is my understanding right? If so, how do I define multiple CCs? The process_descriptor_CC
field in the job description file takes only one consumer. How can we pass multiple consumers and their pipeline associations?
If the intention is to process a large collection of documents with high throughput, then the three pipelines, each including its CAS consumer, would all go in the AE (process_descriptor_AE), and the AE would include a custom flow controller to route CASes as desired. Within a single AE instance, CASes are processed one at a time, but multiple CM+AE threads can run in parallel if the number of JP threads (process_thread_count) is set greater than 1.
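As a rough illustration of the routing decision such a flow controller would make, here is a minimal sketch in plain Java with no UIMA dependency. It assumes the CAS multiplier marks each CAS with a route tag ("A", "B" or "C"); the tag, the class name RouteSketch and the delegate keys are all hypothetical. In a real aggregate this logic would live in the Flow returned by a class extending CasFlowController_ImplBase, returning a SimpleStep for each delegate key and a FinalStep when the route is exhausted.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Sketch only: models the per-CAS routing decision, not the UIMA API itself.
final class RouteSketch {

    // Hypothetical delegate keys; in a real aggregate descriptor these would
    // have to match the keys of the delegate analysis engines.
    static final Map<String, List<String>> ROUTES = Map.of(
        "A", List.of("PipelineA", "ConsumerA"),
        "B", List.of("PipelineB", "ConsumerB"),
        "C", List.of("PipelineC", "ConsumerC"));

    // Looks up the ordered delegates for a route tag (assumed to be set on
    // the CAS by the CAS multiplier).
    static List<String> delegatesFor(String routeTag) {
        List<String> route = ROUTES.get(routeTag);
        if (route == null) {
            throw new IllegalArgumentException("Unknown route tag: " + routeTag);
        }
        return route;
    }

    // One Flow object per CAS, mirroring how a flow controller creates a
    // flow for each CAS that enters the aggregate.
    static final class Flow {
        private final Deque<String> remaining;

        Flow(String routeTag) {
            this.remaining = new ArrayDeque<>(delegatesFor(routeTag));
        }

        // Returns the next delegate key, or null once the route is finished
        // (the point at which a real flow would return a FinalStep).
        String next() {
            return remaining.poll();
        }
    }
}
```

With this arrangement each CAS visits only its own pipeline and consumer, so a single process_descriptor_AE covers all three branches and no separate process_descriptor_CC entries are needed.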