Search code examples
migrationdatabase-migrationetltalend

How to create a Talend job that uses dynamic schema for each run


I am developing a migration tool and using the Talend ETL tool (Free edition).

Is it possible to create a Talend job that uses dynamic schema every time it runs i.e. no hard-coded mappings in tMap component.

I want user to give a input CSV/Excel file and the job should create mappings on the basis of that input file. Is it possible in talend?


Solution

  • Yes, this can be done in Talend but if you do not wish to use a tMap then your table and file must match exactly. The way we have implemented it is for stage tables which are all datatype of varchar. This works when you are loading raw data into a stage table, and your validation is done after the load, prior to loading the stage data into a data warehouse.

    Here is a summary of our method:

    1. the filenames contain the table name so the process starts with a tFileList and parsing out the table name from the file name.
    2. using tMSSQLColumnList obtain each column name, type, and length for the table (one way is to store it as an inline table in tFixedFlowInput)
    3. run this thru a tSetDynamicSchema to produce your dynamic for that table
    4. use a file input reference the dynamic schema.
    5. load that into a MSSQLOutput again referencing the dynamic schema.

    One more note on data types. It may work with data types than varchar, but our stage tables only have varchar and datetime. We had issues with datetime, so we filtered out those column types with a tMap.

    Keep in mind, this is a summary to point you in the right direction, not a precise tutorial. But with this info in your hands, it can save you many hours of work while building your solution.