oracle-database google-cloud-data-fusion

What do the parameters in a Data Fusion pipeline's source connection do?


I am not able to understand the use of these arguments. My source is the Oracle DB plugin, where all of these arguments are present.


Solution

  • Bounding Query - Returns the minimum and maximum values of the Split-By Field Name field, for example SELECT MIN(id), MAX(id) FROM table. It is required unless Number of Splits to Generate is set to 1.

  • Split-By Field Name - As per this document, this is the name of the field used to generate splits. It is used together with the import query:

    The SELECT query is used to import data from the specified table. You can specify an arbitrary number of columns to import, or import all columns using *. The query should contain the ‘$CONDITIONS’ string, for example ‘SELECT * FROM table WHERE $CONDITIONS’. The ‘$CONDITIONS’ string will be replaced by the Split-By Field Name limits specified by the bounding query. The ‘$CONDITIONS’ string is not required if Number of Splits to Generate is set to 1.

  • Number of Splits - Optional. The number of splits to generate.

  • Fetch Size - Optional. The number of rows to fetch at a time per split. A larger fetch size can result in a faster import, with the trade-off of higher memory usage. The default is 1000.

  • Default Batch Value - Optional. The default batch value that triggers an execution request.

  • Default Row Prefetch - Optional. The default number of rows to prefetch from the server.
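To make the relationship between the bounding query, the split-by field and ‘$CONDITIONS’ more concrete, here is a minimal Python sketch of the idea. It is not the plugin's actual implementation: the function names `split_conditions` and `build_queries`, the even integer ranges, and the `1 = 1` placeholder for a single split are all assumptions made for illustration. It assumes the bounding query has already returned an integer minimum and maximum.

```python
def split_conditions(column, lo, hi, num_splits):
    """Return one WHERE-clause fragment per split, together covering [lo, hi].

    lo and hi stand in for the MIN/MAX values returned by the bounding query.
    """
    if num_splits <= 1:
        # Assumption: with a single split no range restriction is needed.
        return ["1 = 1"]

    span = hi - lo + 1
    # Never create more splits than there are distinct integer values.
    num_splits = min(num_splits, span)

    conditions = []
    start = lo
    for i in range(num_splits):
        # Spread any remainder over the first few splits.
        size = span // num_splits + (1 if i < span % num_splits else 0)
        end = start + size - 1
        conditions.append(f"{column} >= {start} AND {column} <= {end}")
        start = end + 1
    return conditions


def build_queries(import_query, column, lo, hi, num_splits):
    """Substitute each split's range condition for $CONDITIONS."""
    return [
        import_query.replace("$CONDITIONS", condition)
        for condition in split_conditions(column, lo, hi, num_splits)
    ]


if __name__ == "__main__":
    # Hypothetical values: the bounding query returned MIN(id)=1, MAX(id)=100.
    for query in build_queries("SELECT * FROM table WHERE $CONDITIONS", "id", 1, 100, 4):
        print(query)
```

Each generated query reads a disjoint slice of the table, so the slices can be fetched in parallel. Fetch Size then controls how many rows each of those queries pulls per round trip.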