Search code examples
mongodbcsvmongoimport

MongoDB: Converting field type from String to Array


I used mongoimport to import an csv file. This dataset to be specific:

https://www.google.com/fusiontables/DataSource?docid=1d7Qps--r0i-E4drYQQzntSdXN8xJ4-6qk24CiMed#map:id=3

The problem I am having is with the description field.

[{"landing_point_id":3522,"latlon":"51.898325,-8.472768","name":"Cork, Ireland"}]

I figure this is an array of objects so I am making a mongoose model for it like this:

  description: [{
    landing_point_id: Number,
    latlon: String,
    name: String
        }],

But this gives me an empty array. If I set the type of description to String I do get the values – but of course as a string, so the properties are not accessible.

"description" : "[{\"landing_point_id\":8398,\"latlon\":\"52.207114,1.620294\",\"name\":\"Sizewell, United Kingdom\"}]"

So the problem seem to be that the field description is a String when I would like it to be an Array.

Following answers on here I have tried to convert it from string to array, but with no luck.

db.cables.find().snapshot().forEach(function (el) {
  el.description_array = [ el.description ];
  db.cables.save(el);
});

This just wraps the string in another array.

"description_array" : [ "[{\"landing_point_id\":8398,│    col10: '',
\"latlon\":\"52.207114,1.620294\",\"name\":\"Sizewell, United Kingdom\"}]" ]

And the same for

  el.description_array = new Array(el.description);

Any ideas how to solve this?

Something that could be edited in the csv file before importing to make mongoimport interpret it correctly?


Solution

  • The "string" now needs to be "parsed" into a valid data structure. Also the "latlong" will be useless to you as both a "string" itself and in the wrong order for how MongoDB expects the coordinates.

    So we fix both:

    var bulk = db.cables.initializeOrderedBulkOp(),
        count = 0;
    
    db.cables.find({ "description": { "$type": 2 } }).forEach(function(doc) {
        doc.description = JSON.parse(doc.description);
        doc.description = doc.description.map(function(desc) {
            desc.coordinates = desc.latlon.split(",").reverse().map(function(el) { 
                return parseFloat(el);
            });
            delete desc.latlong;
            return desc;
        });
    
        bulk.find({ "_id": doc._id }).updateOne({
            "$set": { "description": doc.description }
        });
        count++;
    
        // Send batch one in 1000
        if (count % 1000 == 0) {
           bulk.execute();
           bulk = db.cables.initializeOrderedBulkOp();
        }
    });
    
    // Clear any queued
    if ( count % 1000 != 0 )
        bulk.execute();
    

    Change your mongoose schema to this:

    "description": [{
        "landing_point_id": Number,
        "coordinates": [],
        "name": String
    }],
    

    And now you have data you can index and use with GeoSpatial queries.