hadoop apache-pig bigdata parentheses

Pig removing parentheses when storing output


I'm new to Pig programming and am currently trying to implement my Hadoop jobs with Pig. So far my Pig programs work. I have some output files stored as *.txt with a semicolon as the delimiter. My problem is that Pig adds parentheses around the tuple's...

Is it possible to store the output in a file without these parentheses, so that only the values are written? Maybe by overriding PigStorage with a UDF? Does anyone have a hint for me?

I want to load my output files into an RDBMS (Oracle) without the parentheses.


Solution

  • You probably need to write your own custom Storer. See: http://wiki.apache.org/pig/Pig070LoadStoreHowTo.

    It shouldn't be too difficult to have it write the tuples as plain CSV or a similar delimited format. There's also a pre-existing DBStorage class (in Piggybank) that you might be able to use to write directly to Oracle if you want.
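
As a rough illustration of the formatting logic such a custom Storer would need, here is a minimal sketch in plain Java. It deliberately does not use Pig's API: `TupleFlattener` is a hypothetical helper, and a Pig tuple is modeled as a `java.util.List`, with a nested tuple being a `List` inside a `List`. In a real Storer, something along these lines would presumably be called from the `putNext(Tuple)` method of a class extending `org.apache.pig.StoreFunc`, with the fields read from the Pig `Tuple` instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig. A tuple is modeled as a List;
// a nested tuple is a List that appears as a field of another List.
final class TupleFlattener {

    // Joins all leaf values of a (possibly nested) tuple with the delimiter.
    // No parentheses are written; null fields become empty strings.
    static String flatten(List<?> tuple, String delimiter) {
        List<String> leaves = new ArrayList<>();
        collect(tuple, leaves);
        return String.join(delimiter, leaves);
    }

    // Depth-first walk: descend into nested lists, record everything else as text.
    private static void collect(Object field, List<String> out) {
        if (field instanceof List<?>) {
            for (Object child : (List<?>) field) {
                collect(child, out);
            }
        } else {
            out.add(field == null ? "" : field.toString());
        }
    }
}
```

For example, `TupleFlattener.flatten(java.util.Arrays.asList("a", java.util.Arrays.asList(1, 2)), ";")` returns `a;1;2`, which is the kind of line an RDBMS loader can read directly. This sketch does no quoting or escaping, so values that themselves contain the delimiter would need extra handling.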