I'm writing text directly to a file on HDFS using PUT_LINE() from HPL/SQL's UTL_FILE package. Each line in the file consists of several text fields delimited by semicolons.
Note:
Any ideas on how to read the external data while it is in this state, or how to prevent PUT_LINE() from writing it out in this state?
I haven't found a way to fix the issue with the serialization.encoding option, but here are two workarounds:
Use PRINT() or DBMS_OUTPUT.PUT_LINE() to write out a text string to the Linux file system and then push it over to HDFS.
Use REGEXP_REPLACE to remove the null characters in each column:
regexp_replace(column-name,'\x00','')
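To see what that pattern does to a semicolon-delimited line, here is a minimal Python sketch. It is only an illustration of the same regex applied outside Hive, not part of HPL/SQL: the function names and the sample line are made up, and the placement of the null characters in the sample is an assumption.

```python
import re

# Same pattern as in the regexp_replace call above: a single NUL character.
NUL_PATTERN = re.compile(r"\x00")


def strip_nulls(field):
    """Return the field with every NUL character removed."""
    return NUL_PATTERN.sub("", field)


def clean_line(line, delimiter=";"):
    """Split a delimited line and strip NUL characters from each field."""
    return [strip_nulls(field) for field in line.split(delimiter)]


if __name__ == "__main__":
    # Hypothetical line with NUL characters mixed into each field.
    sample = "\x00a\x00b;\x00c\x00d;\x00e\x00f"
    print(clean_line(sample))  # prints ['ab', 'cd', 'ef']
```

The cleanup is done per field, as in the workaround above, so the semicolon delimiters are left untouched.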