Search code examples
sqlpattern-matchingteradataflat-file

columns manipulation in fast load


Hello i am new to teradata. I am loading flat file into my TD DB using fast load. My data set(CSV FILE) contains some issues like some of the rows in city column contains proper data but some of the rows contains NULL. The values of the city columns which contains NULL are stored into the next column which is zip code and so on. At the end some of the rows contains extra columns due to the extra NULL in rows. Examples is given below. How to resolve these kind of issues in fastload? Can someone answer this with SQL example?

 City    Zipcode                 country    
 xyz     12                       Esp
 abc     11                       Ger
 Null    def(city's data)         12(zipcode's data)         Por(country's data)  

Solution

  • What about different approach. Instead of solving this in fast load, load your data to temporary table like DATABASENAME.CITIES_TMP with structure like below

    City | zip_code | country | column4
    xyz  | 12       | Esp     | 
    NULL | abc      | 12      | Por
    

    In next step create target table DATABASENAME.CITY with the structure

    City | zip_code | country |
    

    As a final step you need to run 2 INSERT queries:

    INSERT INTO DATABASENAME.CITY (City, zip_code, country)
    SELECT City, zip_code, country FROM DATABASENAME.CITIES_TMP
    WHERE CITY not like 'NULL'/* WHERE CITY is not null - depends if null is a text value or just empty cell*/;
    
    INSERT INTO DATABASENAME.CITY (City, zip_code, country)
    SELECT Zip_code, country, column4 FROM DATABASENAME.CITIES_TMP
    WHERE CITY like 'NULL' /* WHERE CITY is null - depends if null is a text value or just empty cell*/
    

    Of course this will work if all your data looks exacly like in sample you provide. This also will work only when you need to do this once in a while. If you need to load data few times a day it will be a litte cumbersome (not sure if I used proper word in this context) and then you should build some kind of ETL process with for example Talend tool.