I want to create a Spoon
transformation which will work on multiple values of an column.
Input to my transformation is an CSV file. In that CSV file there's one column named 'Technology' which contains 0 or more values seperated by semi colon as follows.
+------------------------------------------------------+
row_id | Technology
+------------------------------------------------------+
1 | Cobol ; Db2 ; Jcl ; Vsam ; Cics ; Changeman ;
2 | Oracle ; Sql ; Db2 ; Oracle 9i ;
3 | Windows 2000 ; SQL ;
+------------------------------------------------------+
I have one table in database named 'Technologies' and its schema is as follows :
+----------------------+
Technologies
+----------------------+
id | technology_name
+----------------------+
where id
column is set to auto increment.
I want to insert values of technology
column only if that value is not present in Technologies
table.
Can anyone please tell me
1) Which type of step to be used to split values of technology
column?
2) How to insert value only once? For example in row 1
and row 2
, Db2
is repeated but I want to insert Db2
only once.
Thanks in advance !
Use the "Split Fields" (Under "Transform") to split the contents.
CSV file input --> Split Fields --> rest of transformation
Set the "Field to Split" to "Technology" and set the "Delimiter" to a semi-colon.
Regarding the non-repeating field, my suggestion would be you make the name itself the key to the table. Shift it to lower-case and replace any spaces / special characters with database safe equiv's and then make that the primary key. You should end up with a table full of only unique instances.
hth