Search code examples
javaibm-integration-busextended-sql

Delete a single line if a specific string is found


I'm fairly new with IIB, currently what I want to achieve is to delete a single line from a TXT if it contains a specific word for example the word USA as per below.I read it as a BLOB then convert it to a string. Should I achieve this using Compute node or Java node? Thanks in advance.

e.g

Before

Hello my name 
is Malcom and I live
in the USA

After

Hello my name 
is Malcom and I live

Current Flow FileInput -> Compute -> JavaCompute -> FileOutput

FileInput : To read data from a specific folder

Compute : Replace a string to another string (mask)

CREATE PROCEDURE getBLOBMessage() BEGIN
        DECLARE fullBLOB CHARACTER;
        SET fullBLOB = CAST(OutputRoot.BLOB.BLOB as char CCSID 1208 Encoding 815);
        SET OutputLocalEnvironment.msg = fullBLOB;
    END;
    
    CREATE PROCEDURE maskMessage(INOUT msg CHARACTER) BEGIN
         SET msg = REPLACE (msg, '431.111.55.113', 'XXX.XXX.XX.XXX');
         SET msg = REPLACE (msg, '111.115.11.112', 'XXX.XXX.XX.XXX');
         SET msg = REPLACE (msg, '111.112.11.112', 'XXX.XXX.XX.XXX');
         SET msg = REPLACE (msg, '111.111.111.116', 'XXX.XXX.XXX.XXX');
         SET msg = REPLACE (msg, '172.16.18.72', 'XXX.XX.XX.XX');
         SET msg = REPLACE (msg, 'b1111111110', 'XXXXXXXXXXX');
         SET msg = REPLACE (msg, '11111111101', 'XXXXXXXXXXX');
         SET msg = REPLACE (msg, '11111111111', 'XXXXXXXXXXX');
         SET msg = REPLACE (msg, 'B1111111111', 'XXXXXXXXXXX');
         SET msg = REPLACE (msg, 'Q1111111', 'XXXXXXXX');
         SET msg = REPLACE (msg, '11111111111N', 'XXXXXXXXXXXX'); 
         SET OutputRoot.BLOB.BLOB = CAST (msg AS BLOB CCSID 1208 Encoding 815);
    END;

JavaCompute: For removing line maybe?

FileOutput: To generate the output txt file


Solution

  • If you use the Record detection feature of the File Input node, your requirements can be fulfilled in ESQL.

    FileInput node:

    • Records and Elements: Record detection = Delimited
    • Connect End of Data to Finish File of FileOutput node

    Compute node:

    CREATE COMPUTE MODULE Thaqif_Compute
    
        CREATE FUNCTION Main() RETURNS BOOLEAN
        BEGIN
            SET OutputRoot = InputRoot;
            DECLARE line CHARACTER CAST(OutputRoot.BLOB.BLOB AS CHAR
                                        CCSID InputProperties.CodedCharSetId
                                        ENCODING InputProperties.Encoding);
            IF CONTAINS(line, 'USA') THEN
                RETURN FALSE;
            ELSE
                CALL maskMessage(line);
                SET OutputRoot.BLOB.BLOB = CAST(line AS BLOB 
                                                CCSID InputProperties.CodedCharSetId
                                                ENCODING InputProperties.Encoding);
                RETURN TRUE;
            END IF;
        END;
    
        CREATE PROCEDURE maskMessage(INOUT msg CHARACTER) BEGIN
            SET msg = REPLACE (msg, '431.111.55.113', 'XXX.XXX.XX.XXX');
            -- Other patterns removed for brevity
            SET msg = REPLACE (msg, 'Q1111111', 'XXXXXXXX');
        END;
    
    END MODULE;
    

    FileOutput node:

    • Records and Elements: Record definition = Record is delimited data

    Example input:

    Hello my name 
    is Malcom and I live
    in the USA
    where 431.111.55.113 is masked
    but Q2222222 is still ok
    

    Resulting output:

    Hello my name 
    is Malcom and I live
    where XXX.XXX.XX.XXX is masked
    but Q2222222 is still ok