Search code examples
pythondatabaseloadnetezzaoperation

Load entire text file in database using python


I want to load entire text file having two fields separated by delimiter || into my netezza database. Inserting each n every row degrades my performance since the data set is huge.

My text file is like:

ING BANK, N.V.||520639
STANDARD CHARTERED BANK MALAYSIA BERHAD||483551
VJA||492287

Could you please help in suggesting how can I upload my file in database in a single go?

Thanks.


Solution

  • If you are loading the data to Netezza remotely via an ODBC connection, you can use the following statement as a template. In it, I load from a text file local to my windows workstation using an ODBC connection through a generic SQL development tool.

    INSERT INTO BNK_CDE_TEST
    SELECT *
    FROM external 'c:\bankdata.TXT'
    USING (DELIMITER '|' REMOTESOURCE 'ODBC');
    

    It is important to note that Netezza does not support multiple-character column delimiters. You either need to pre-process your data to convert your '||' delimiter to something else, or you need to perform a trick like the following where we use '|' as the delimiter so that '||' is treated as two delimiters around a junk empty column.

    INSERT INTO BNK_CDE_TEST
    SELECT COL1,   COL2
    FROM external 'c:\bankdata.TXT' (COL1 VARCHAR(100), junkcol INT, col2 bigint)
    USING (DELIMITER '|' REMOTESOURCE 'ODBC');