
MongoDB Java driver 3.6: iterator() keeps iterating over the same document again and again


I am trying to update the "Name" field of every document in a collection using the iterator() method defined in the FindIterable interface of the MongoDB Java driver. Calling next() on the iterator should give me the next BSON document, but it keeps returning the same one.

public void func1(FindIterable<org.bson.Document> documents, MongoCollection<org.bson.Document> coll_name) {
    /*
     * This function should run until every document in the collection has been retrieved.
     * */
    try {
        while (documents.iterator().hasNext()) {
           coll_name.updateOne(eq("_id", documents.iterator().next().get("_id")), new org.bson.Document("$set", new org.bson.Document("Name", getName((String) documents.iterator().next().get("NameSource")))));
            System.out.println("Document _id " + documents.iterator().next().get("_id") + " updated.....! in time : " + df.format(dater));
        }
    }catch (Exception ex){
        System.out.println(" ~~~~~~~~~~~Was not able to getName() & update document~~~~~~~~~~~~~~~~~~");
        System.out.println(ex.getMessage()) ;
    } finally {
        documents.iterator().close();
    }
}

Call to the function:

  FindIterable<org.bson.Document> FRESH_docs = coll_name.find(exists("Text", false)).noCursorTimeout(true);

    int flag = 1;
    try{
        func1(FRESH_docs, coll_name, flag);
    }catch (Exception ex){
        System.out.println(" ~~~~~~~~~~~call to func1() failed for FRESH_docs ~~~~~~~~~~~~~~~~~~");
        System.out.println(ex.getMessage());
    }

The output is:

Document _id 4713205 updated.....! in time : 2017-12-25 08:56:42.876
Document _id 4713205 updated.....! in time : 2017-12-25 08:56:42.902
Document _id 4713205 updated.....! in time : 2017-12-25 08:56:42.930
Document _id 4713205 updated.....! in time : 2017-12-25 08:56:42.958
Document _id 4713205 updated.....! in time : 2017-12-25 08:56:42.984
Document _id 4713205 updated.....! in time : 2017-12-25 08:56:43.012
.....

I removed the date-time printers to keep the code easy to read. Can anyone suggest what mistake I am making that causes the loop to iterate over the same BSON document?


Solution

  • There are a few issues in the way you're processing the cursor:

    1. Call documents.iterator() once. Looking at the driver source, each call to iterator() returns a new cursor, so every hasNext() and next() in your loop appears to act on a fresh cursor positioned at the first document instead of advancing an existing one:

      MongoCursor<org.bson.Document> iterator = documents.iterator();
      while (iterator.hasNext()) {
         // process
      }
      
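    2. You're calling next() several times per loop iteration while checking hasNext() only once. Each next() call advances the cursor, so the _id in the filter, the NameSource value, and the _id in the log line would come from different documents, and you'll eventually call next() after the cursor has been exhausted. Suggested change: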

      // Call next() once per iteration and reuse the document:
      while (iterator.hasNext()) {
          org.bson.Document nextDocument = iterator.next();
          coll_name.updateOne(
                  eq("_id", nextDocument.get("_id")),
                  new org.bson.Document("$set",
                          new org.bson.Document("Name", getName((String) nextDocument.get("NameSource")))));
          System.out.println("Document _id " + nextDocument.get("_id") + " updated.....! in time : " + df.format(dater));
      }
      

    As you can see, point 2 builds on point 1.
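
The effect described in the two points above can be reproduced without MongoDB. The following is a minimal sketch that uses a plain java.util.List as a stand-in for the FindIterable, on the assumption that both hand out a new iterator on every iterator() call. The class name IteratorReuseDemo, the method names, the maxSteps cap, and the "doc-N" values are all made up for illustration and are not part of the driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; a List stands in for the FindIterable from the question.
class IteratorReuseDemo {

    // Mirrors the loop in the question: iterator() is called on every access,
    // so hasNext() and next() always act on a new iterator that starts at the
    // first element. maxSteps caps a loop that would otherwise never end.
    static List<String> freshIteratorEachTime(Iterable<String> source, int maxSteps) {
        List<String> seen = new ArrayList<>();
        int steps = 0;
        while (source.iterator().hasNext() && steps < maxSteps) {
            seen.add(source.iterator().next());
            steps++;
        }
        return seen;
    }

    // Mirrors the suggested fix: obtain the iterator once and call next()
    // exactly once per pass, reusing the returned element.
    static List<String> singleIterator(Iterable<String> source) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            String current = it.next();
            seen.add(current);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("doc-1", "doc-2", "doc-3");
        System.out.println(freshIteratorEachTime(ids, 3)); // prints [doc-1, doc-1, doc-1]
        System.out.println(singleIterator(ids));           // prints [doc-1, doc-2, doc-3]
    }
}
```

The same reasoning likely applies to the finally block in the question: documents.iterator().close() would close a newly created cursor rather than the one being iterated. Since MongoCursor implements Closeable, one option is to open the single cursor in a try-with-resources statement so that the cursor actually being used is the one that gets closed.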