I am trying to export data from Mongo to Oracle. I used the approach below.
Step 1 : Export the data to a CSV file using the mongoexport command.
Step 2 : Read the data with Java code and do the necessary data transformation.
Step 3 : Insert the data into Oracle.
The issue is that when a comment field contains a newline character ('\n'), the rest of the record moves to the next line and the Java reader fails to process the document.
There is an open bug with 10gen for this (JIRA). Has anyone else faced this issue? Is there a workaround?
As with many formatting nuances in CSV, there is no agreed "standard" for how to handle embedded newline characters in a CSV field.
A common implementation is RFC-4180: Common Format and MIME Type for Comma-Separated Values (CSV) Files, which suggests:
6) Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes.
For example:
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
This is the format that mongoexport is currently using. If you use a CSV parser compliant with RFC-4180 (e.g. SuperCSV, as suggested by @evanchooly), it should handle the quoted newlines as expected.
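To show why a line-by-line reader breaks where an RFC-4180 parser does not, here is a minimal sketch of a quote-aware reader in plain Java. It is not SuperCSV and not a full implementation of the RFC; the class name Rfc4180Reader and the method parse are made up for this illustration. It only demonstrates that a newline inside double quotes stays part of the field instead of ending the record. For real data a tested library is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of SuperCSV or mongoexport.
public class Rfc4180Reader {

    /** Splits CSV text into records, keeping line breaks that occur inside quoted fields. */
    public static List<List<String>> parse(String csv) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean pending = false; // true once the current record has any content

        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < csv.length() && csv.charAt(i + 1) == '"') {
                        field.append('"'); // a doubled quote is an escaped quote
                        i++;
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);       // CR and LF are kept inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;
                pending = true;
            } else if (c == ',') {
                record.add(field.toString());
                field.setLength(0);
                pending = true;
            } else if (c == '\n' || c == '\r') {
                if (c == '\r' && i + 1 < csv.length() && csv.charAt(i + 1) == '\n') {
                    i++; // treat CRLF as a single line break
                }
                if (pending) { // blank lines are skipped
                    record.add(field.toString());
                    records.add(record);
                    record = new ArrayList<>();
                    field.setLength(0);
                    pending = false;
                }
            } else {
                field.append(c);
                pending = true;
            }
        }
        if (pending) { // last record without a trailing line break
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // The example from RFC-4180 quoted above, with CRLF written as \r\n.
        String csv = "\"aaa\",\"b\r\nbb\",\"ccc\"\r\nzzz,yyy,xxx";
        for (List<String> r : parse(csv)) {
            System.out.println(r.size() + " fields: " + r);
        }
    }
}
```

The key point is that the reader works on the whole character stream and tracks whether it is inside quotes, rather than calling readLine() and splitting each line on commas.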
If you need an alternative to the format used by mongoexport or need more flexibility in your output, you can always write your own export script.
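As a rough idea of what the formatting half of such a script could look like in Java, the sketch below writes each record on exactly one physical line by replacing embedded line breaks with a space before quoting. This is an assumption about what "more flexibility" might mean here, and it is lossy, since the original line breaks are not recoverable. The class SingleLineCsv and the method toCsvLine are hypothetical names, and reading the documents from MongoDB is deliberately left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for a custom export script; fetching data from MongoDB is not shown.
public class SingleLineCsv {

    /** Formats one record as a single physical line of CSV. */
    public static String toCsvLine(List<String> fields) {
        StringJoiner line = new StringJoiner(",");
        for (String value : fields) {
            String text = (value == null) ? "" : value;
            // Collapse any run of CR/LF into one space so the record never spans lines.
            text = text.replaceAll("[\\r\\n]+", " ");
            // Double embedded quotes and wrap the field in quotes, as RFC-4180 describes.
            line.add("\"" + text.replace("\"", "\"\"") + "\"");
        }
        return line.toString();
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("42", "first line\nsecond line", "say \"hi\"");
        System.out.println(toCsvLine(doc));
    }
}
```

With output in this shape, a simple line-based reader on the Java side sees one record per line, at the cost of losing the original line breaks in the comment text.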