I have a csv file which contains ["] if the value for that attribute is null. For example:
a,b,c
1,2,3
4,[""],6
7,8,9
I have two RHEL systems with different Mongo versions: 2.4.1 and 3.4 When I run this command:
time mongoimport --host hostname:27017 --db test --collection abc --file example.csv --type csv
This throws an error in version 3.4:
bare " in non-quoted-field
It works fine in 2.4.1 though!!! I need it to work on 3.4. Can anyone suggest a solution? the CSV files I have are huge and I dont want to alter the data.
MongoImport 3.4 documentation says :
The csv parser accepts that data that complies with RFC RFC 4180. As a result, backslashes are not a valid escape character. If you use double-quotes to enclose fields in the CSV data, you must escape internal double-quote marks by prepending another double-quote.
Hence, in your case, i.e. [""]
, it considered the first quote to be escape character due to which it is left with a bare "
and due to which failed. To escape the double-quote, you can use it like [""""]
. Here each double quote escapes the next double quote
To fix this, you have the following two options :
[""]
just denotes an empty field, you may directly replace this pattern with an empty string. Mongoimport
will not create a field for the empty values in the corresponding doc.EDIT :
As discussed, it is used to represent a null value, then you can totally replace this string using sed like
sed 's/\[""\]//g' FILE-NAME