
Fastest way to import a 1 GB JSON file into MongoDB


I want to import this JSON file into MongoDB in upsert mode.

File: http://bulk.openweathermap.org/sample/daily_16.json.gz

The file is almost 1 GB uncompressed (the compressed version at the link above is 90 MB).

Extracting this archive and importing the 1 GB file takes more than 30 minutes. As of typing this question, 25% has been imported and it has already taken 20 minutes. That is too time-consuming. Is there a faster way to do this?

.\mongoimport.exe --uri="mongodb://localhost/openWeather" --collection=openWeatherData --mode=upsert --upsertFields=city.id --file="daily_16.json"


Solution

  • Create an index on city.id:

    db.runCommand({
       createIndexes: 'openWeatherData',
       indexes: [{
          key: { "city.id": 1 },
          name: "city_id"
       }]
    })
    

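    and use the --numInsertionWorkers parameter, so that mongoimport inserts with several workers in parallel (the default is a single worker).

    In upsert mode, mongoimport has to look up each incoming document by city.id. Without an index on that field, every lookup scans the collection, so the import gets slower as the collection grows.

    On my machine, the import took 1-2 minutes.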
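Putting the two pieces together, the import might be run as in the sketch below, once the index exists. It reuses the command from the question and only adds --numInsertionWorkers. The worker count of 4 is an assumed value, not taken from the answer, and should be tuned to the machine. The sketch is written for a POSIX shell; on Windows, keep the .\mongoimport.exe form from the question and append the same flag.

```shell
# Assumed value (not from the answer): number of parallel insertion workers.
WORKERS=4

# Same arguments as in the question, plus --numInsertionWorkers.
set -- mongoimport \
  --uri="mongodb://localhost/openWeather" \
  --collection=openWeatherData \
  --mode=upsert \
  --upsertFields=city.id \
  --numInsertionWorkers="$WORKERS" \
  --file="daily_16.json"

# Show the command that will be run.
printf '%s\n' "$*"

# Run it only if mongoimport is installed and the extracted file is present.
if command -v mongoimport >/dev/null 2>&1 && [ -f daily_16.json ]; then
  "$@"
fi
```

Create the city.id index before starting the import, so the upsert lookups can use it from the first document onward.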