I have a collection with 1 million documents. Each document has an ip field. I have a Node function that returns the country code for an IP passed as its parameter. My plan was to fetch all the records, run the function on each, add the returned country code to each document, and update them all at once. However, MongoDB has a 16 MB limit.
Before
{
_id: xxxxx,
ip: '207.97.227.239'
}
After
{
_id: xxxxx,
ip: '207.97.227.239',
country_abbr: 'US'
}
My question is: how can I safely and quickly update these 1 million records?
I am assuming that you will set the country_abbr field depending on the value of ip. So I think you will need an update command that matches on ip and sets the value of country_abbr. This is how you should do it:
db.collection.update(
    {ip : condition_for_ip},
    {$set : {country_abbr : desired_value}},
    {multi : true}
);
You will need to run this query multiple times to cover all the countries possible in your collection.
To check whether any documents in your collection still have no value for country_abbr, you can run the following query:
db.collection.find({'country_abbr' : {$exists : false}});
If the find query above returns any documents, you can read their ip values and see which countries you still need to add.
Edit after clarification:
The returned documents are too large and exceed the 16 MB limit in your case. Instead, fetch only the ip values and store them all in a linked list. Then iterate through the list and use the Node function that you have to get the correct country_abbr value for each ip. Finally, issue a simple update to MongoDB in the way written above.
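The fetch-then-update loop described above could be sketched roughly as follows. This is only a sketch under some assumptions, not the asker's actual code: it assumes the official MongoDB Node.js driver, a synchronous lookup function (called lookupCountry here, a made-up name standing in for the asker's function), and it swaps the multi update shown earlier for bulkWrite with one updateOne per document, sent in batches. The names buildCountryOps and addCountryCodes and the batch size of 1000 are likewise hypothetical.

```javascript
// Hypothetical helper: turn fetched { _id, ip } documents into bulkWrite operations.
// lookupCountry is assumed to be a synchronous function: ip string -> country code.
function buildCountryOps(docs, lookupCountry) {
  return docs.map((doc) => ({
    updateOne: {
      filter: { _id: doc._id },
      update: { $set: { country_abbr: lookupCountry(doc.ip) } },
    },
  }));
}

// Stream documents that still lack country_abbr, fetching only _id and ip,
// and write the updates back in fixed-size batches instead of all at once.
async function addCountryCodes(collection, lookupCountry, batchSize = 1000) {
  const cursor = collection.find(
    { country_abbr: { $exists: false } },
    { projection: { ip: 1 } }
  );

  let batch = [];
  let updated = 0;

  for await (const doc of cursor) {
    batch.push(doc);
    if (batch.length === batchSize) {
      await collection.bulkWrite(buildCountryOps(batch, lookupCountry), { ordered: false });
      updated += batch.length;
      batch = [];
    }
  }

  // Flush whatever is left after the cursor is exhausted.
  if (batch.length > 0) {
    await collection.bulkWrite(buildCountryOps(batch, lookupCountry), { ordered: false });
    updated += batch.length;
  }

  return updated;
}
```

Because the cursor is read one document at a time and each write covers only one batch, the full set of 1 million documents is never held in memory at once. Filtering on missing country_abbr also means the function can be re-run after an interruption and will pick up only the documents that are still unprocessed.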
To generalize this answer for everyone: if you don't have a function like the asker's, you can work out the country_abbr value manually, or by any other means, and supply it to the update command above.