Search code examples
springspring-batchspring-cloud-dataflowspring-cloud-dataflow-ui

Setting global properties in Composed Task which would be accessible in each of subtasks


Setting 'common' properties for child tasks is not working

The SCDF version I'm using is 2.9.6.

I want to make CTR A-B-C, each of tasks does follows:

A : sql select on some source DB

B : process DB data that A got

C : sql insert on some target DB

Simplest way to make this work seems to define shared work directory folder Path "some_work_directory", and pass it as application properties to A, B, C. Under {some_work_directory}, I just store each of task result as file, like select.result, process.result, insert.result, and access them consequently. If there is no precedent data, I could assume something went wrong, and make tasks exit with 1.

================

I tried with a composed task instance QWER, with two task from same application "global" named as A, B. This simple application prints out test.value application property to console, which is "test" in default when no other properties given.

If I tried to set test.value in global tab on SCDF launch builder, it is interpreted as app.*.test.value in composed task's log. However, SCDF logs on child task A, B does not catch this configuration from parent. Both of them fail to resolve input given at launch time.

If I tried to set test.value as row in launch builder, and pass any value to A, B like I did when task is not composed one, this even fails. I know this is not 'global' that I need, it seems that CTR is not working correctly with SCDF launch builder.

The only workaround I found is manually setting app.QWER.A.test.value=AAAAA and app.QWER.B.test.value=BBBBB in launch freetext. This way, input is converted to app.QWER-A.app.global4.test.value=AAAAA, app.QWER-B.app.global4.test.value=BBBBB, and print well.

I understand that, by this way, I could set detailed configurations for each of child task at launch time. However, If I just want to set some 'global' that tasks in one CTR instance would share, there seems to be no feasible way.

Am I missing something? Thanks for any information in advance.


Solution

  • CTR will orchestrate the execution of a collection of tasks. There is no implicit data transfer between tasks. If you want the data from A to be the input to B and then output of B becomes the input of C you can create one Task / Batch application that have readers and writers connected by a processor OR you can create a stream application for B and use JDBC source and sink for A and C.