How to create and refresh an index in PyMongo to speed up update queries
As mentioned in the MongoDB section of the article [1], the following code works fine for a small set of entries:
self.collection.update({'url': item['url']}, dict(item), upsert=True)
But once the collection reaches tens of thousands of entries, it becomes very slow.
[1] https://realpython.com/web-scraping-and-crawling-with-scrapy-and-mongodb/#mongodb
Create an index on the url field:
https://docs.mongodb.com/manual/indexes/
self.collection.create_index('url')
In your case url will be unique, so you can create a unique index instead:
https://docs.mongodb.com/manual/core/index-unique/#unique-indexes
self.collection.create_index('url', unique=True)
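Note: if you already have a large amount of existing data, create the index in the background (pass background=True to create_index) so that the build does not block reads and writes on the collection.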
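
Putting it together, here is a minimal sketch of how the pipeline could create the index once and then upsert by url. The helper names ensure_url_index and upsert_item are hypothetical, and collection is assumed to be a PyMongo Collection. It also assumes PyMongo 3 or later, where replace_one with upsert=True does the same job as the update(..., upsert=True) call in the question.

```python
def ensure_url_index(collection, background=True):
    """Create a unique index on 'url'; calling it again for an existing index is harmless."""
    # background=True asks older servers not to block other operations during
    # the build (assumption: MongoDB 4.2+ accepts the option but ignores it).
    return collection.create_index('url', unique=True, background=background)


def upsert_item(collection, item):
    """Insert the item, or replace the stored document that has the same url."""
    doc = dict(item)
    # The filter on 'url' is what the index speeds up.
    return collection.replace_one({'url': doc['url']}, doc, upsert=True)
```

Call ensure_url_index(self.collection) once when the pipeline starts, not on every item. There is no separate refresh step: MongoDB updates the index on every write. If the collection already contains duplicate url values, creating the unique index will fail until the duplicates are removed.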