Search code examples
etltalendgreenplumhawq

Talend greenplumRow error handling


I want to create views in greenplum HAWQ using a simple talend job, that would basically have a fileinput that contains all the views then I need to execute the CREATE VIEW script.

Since these views (50-60.000) come from an oracle system I need to find the ones that we were unable to create.

Here's a mock up for my problem: enter image description here

I have a view already in the DB, and I want to create it 3 more times. This would obviously fail.

Here's the output:

Exception in component tGreenplumRow_2
org.postgresql.util.PSQLException: ERROR: relation "ad_apps_dependencies" already exists
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
.--------------.
|  tLogRow_4   |
|=------------=|
|componenterror|
|=------------=|
|componenterror|
'--------------'

.-----------------------------------------------------+-------------------------------------------------------------.
|                                                                                  tLogRow_5                        |
|=----------------------------------------------------+------------------------------------------------------------=|
|result                                               |result1                                                      |
|=----------------------------------------------------+------------------------------------------------------------=|
|ERROR: relation "ad_apps_dependencies" already exists|CREATE VIEW SYSTEM.AD_APPS   AS SELECT * FROM APPLSYS.AD_APPS|
'-----------------------------------------------------+-------------------------------------------------------------'

.------------.
| tLogRow_6  |
|=----------=|
|subjobError |
|=----------=|
|Subjob Error|
'------------'

I want to have this output 3 times, as I try to run 3 times the query. (In the final version I'd create another file that contains only the queries that failed, so we can fix it later, but this is a crucial point.)

As a workaround: I could move this tRow -> OnError -> FixedFlow -> FailedViews part into a different job, but thats not an elegant solution.


Solution

  • Do it with tJavaFlex. It works for Greenplumrow as well!!! In the below exmaple I execute a query in tOracleInput_6:

    "SELECT count(*) FROM "+((String)globalMap.get("ora_sch.SCHEMA_NAME")) + "." + ((String)globalMap.get("ora_tab.TABLE_NAME"))
    

    I am not sure about query result. It might fail (eg:someone just dropped the table) therefore i use tJavaFlex+iterate connection! The resulting code generated try {...} catch block.

    In below case i iterate over tables of a schema and count the rows in every table. I collect the exceptions into a java hashmap. (the hashmap is in the context, but that's another story)

    enter image description here

    tJavaFlex Begin code:
    try{
    
    tJavaflex main code:
    // here is the main part of the component,
    // a piece of code executed in the row
    // loop
    System.out.println(((String)globalMap.get("TYPE"))+" SELECT count(*) FROM "+((String)globalMap.get("ora_sch.SCHEMA_NAME")) + "." + ((String)globalMap.get("ora_tab.TABLE_NAME")));
    
    
    tJavaFlex1 End code:
    // end of the component, outside/closing the loop
    } catch (Exception e) {
    //put 
    ((Map<String, String>)context.EXCEPTIONS).put(
      ((String)globalMap.get("TYPE"))+((String)globalMap.get("ora_sch.SCHEMA_NAME"))+"_"
      +((Integer)globalMap.get("tFlowToIterate_8_CURRENT_ITERATION")).toString(),
    ((String)globalMap.get("ora_sch.SCHEMA_NAME"))+","
    + ((String)globalMap.get("ora_tab.TABLE_NAME")));
    
    //print some debug message
    System.out.print(((String)globalMap.get("TYPE"))+" ### SQL Exception at ");
    System.out.print("Iteration Number: "+((Integer)globalMap.get("tFlowToIterate_8_CURRENT_ITERATION")).toString()+","+((String)globalMap.get("ora_sch.SCHEMA_NAME"))+","+ ((String)globalMap.get("ora_tab.TABLE_NAME")));
    System.out.println(" ###");
    //e.printStackTrace();
    }