mongodb mongodb-ruby

Experiencing delay when fetching 100th row


I am fetching all the rows from a collection and see a delay at the 100th row. I understand that the find method returns a cursor rather than all the data up front, and that at some point it has to fetch more data. But the 100th row is the only place where I see a delay.

Checking image 99
Checking image 100
*pause*
Checking image 101

After that there is no visible delay, all the way up to image 100 000.

The Ruby script I used:

require 'mongo'

time_start = Time.now

mongo = Mongo::MongoClient.new("localhost", 27017)

db = mongo["pics"]

images = db["images"]
albums = db["albums"]

orphans = []

images.find().each do |row|
    puts "Checking image #{row['_id']}"
end

# puts orphans
time_end = Time.now
puts "Total time taken: #{time_end - time_start}"

The images collection (JSON), imported with:

mongoimport --db pics --collection images file_name

The questions are:

  • Does some data come along with the initial cursor?
  • Why is the only delay at the 100th row? Maybe I've missed something, but I don't even see I/O reads at that point.

Thank you


Solution

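  • By default, the first batch a MongoDB cursor returns holds roughly 100 objects (the MongoDB documentation puts the initial find batch at 101 documents). So yes, some data comes along with the initial cursor. Once the driver has handed out that first batch, it has to go back to the server for the next one, and that round trip is the pause you see. Later batches are limited by size in bytes rather than by document count, so with small documents they hold far more rows and further fetches are much less frequent. Drivers generally provide a "batch_size" setting (a cursor method or a find option, depending on the driver) for reading and changing the batch size.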
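
To make the batching pattern concrete, here is a minimal simulation in plain Ruby. It does not talk to MongoDB at all: BatchedCursor is a hypothetical class written for this answer, not part of the mongo gem. The first batch size of 101 follows the MongoDB documentation, while the later batch size of 20 000 is just an assumed stand-in for "however many small documents fit in a size-limited batch".

```ruby
# Hypothetical stand-in for a driver cursor; NOT part of the mongo gem.
# It hands out documents one by one but "fetches" them in batches,
# counting each simulated round trip to the server.
class BatchedCursor
  include Enumerable

  attr_reader :round_trips

  def initialize(documents, first_batch:, later_batch:)
    @documents   = documents
    @first_batch = first_batch   # documents in the initial batch
    @later_batch = later_batch   # assumed size of every following batch
    @round_trips = 0
  end

  def each
    return enum_for(:each) unless block_given?

    position = 0
    size = @first_batch
    while position < @documents.length
      batch = fetch(position, size)      # one simulated round trip
      batch.each { |doc| yield doc }
      position += batch.length
      size = @later_batch                # later batches are larger
    end
  end

  private

  # Simulates a server request; a real driver would block on the network here.
  def fetch(offset, limit)
    @round_trips += 1
    @documents[offset, limit]
  end
end

docs = (0...100_000).map { |i| { '_id' => i } }
cursor = BatchedCursor.new(docs, first_batch: 101, later_batch: 20_000)

# Record the _id of the first document of every batch,
# i.e. the points where a real cursor would pause.
fetch_points = []
seen_trips = 0
cursor.each do |row|
  if cursor.round_trips != seen_trips
    seen_trips = cursor.round_trips
    fetch_points << row['_id']
  end
end

puts "Batches started at _id: #{fetch_points.inspect}"
```

Under these assumptions the second fetch lands right after the first 101 documents, and the remaining ones are spread thousands of rows apart, which is consistent with noticing only one pause near the 100th row. With the real driver the later batch boundaries depend on document size, so the exact positions will differ.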