
Occasional read failures after create in MongoDB


I'm using Spring's MongoTemplate to integrate with a MongoDB instance in my Java application. I'm running MongoDB 2.4.5 and spring-data-mongodb 1.2.3-RELEASE.

MongoDB runs as a 3-node replica set with no sharding.

I have data creation code which calls the following two operations, sequentially, on the same thread, with WriteConcern=ACKNOWLEDGED:

mongoTemplate.insert(entity);
savedEntity = mongoTemplate.findById(entity.getId(), entity.getClass());

I run this application successfully in a few different environments, but in one environment savedEntity is occasionally (maybe 1 in 100 executions) assigned null. The data is persisted successfully by the insert. I've been able to set a breakpoint conditional on savedEntity == null, and when I hit that breakpoint and force findById to run again via my IDE, it returns the expected (non-null) result.

Logging indicates that these operations happen in quick succession on the same thread (create 5):

2015-01-12 18:32:13,796 DEBUG [create 5] org.springframework.data.mongodb.core.MongoTemplate: Inserting DBObject containing fields: [_class, _id, guid, updated, added, version] in collection: persistentEntity
2015-01-12 18:32:13,798 DEBUG [create 5] org.springframework.data.mongodb.core.MongoTemplate: findOne using query: { "_id" : 4660192} in db.collection: MyDatabase.persistentEntity

It seems to me that the read operation is occurring before the data has been "fully" persisted, and so no matching object is found. But doesn't write atomicity mean this should not happen?

I was worried that my read was going to a stale secondary (since my write does not wait for replication), so I reconfigured my mongoTemplate to list only the primary node in its config, but the problem did not go away.

Any answers, clarification on mongo write-then-read behavior, or troubleshooting tips would be appreciated.


Solution

  • The root cause was that my application was using readPreference=NEAREST against the 3-node replica set. A read issued immediately after a write could therefore be served by a secondary before the write had replicated to it. For this application, readPreference=PRIMARY is required.
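
The race described above can be illustrated with a small toy model. This is only a sketch in plain Java, not MongoDB driver or Spring code: the class ReplicaSetModel, its methods, and its nested ReadPreference enum are hypothetical names invented for the illustration, and NEAREST is assumed to always pick the secondary (the worst case).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a primary plus one lagging secondary. Hypothetical, not driver code. */
class ReplicaSetModel {
    enum ReadPreference { PRIMARY, NEAREST }

    private final Map<Integer, String> primary = new HashMap<>();
    private final Map<Integer, String> secondary = new HashMap<>();
    // Writes the primary has acknowledged but the secondary has not applied yet.
    private final Map<Integer, String> replicationBacklog = new LinkedHashMap<>();

    /** Acknowledged write: returns as soon as the primary has applied it. */
    void insert(int id, String doc) {
        primary.put(id, doc);
        replicationBacklog.put(id, doc);
    }

    /** Simulates the secondary catching up with the primary. */
    void replicate() {
        secondary.putAll(replicationBacklog);
        replicationBacklog.clear();
    }

    /** Assumption: NEAREST always routes to the secondary in this model. */
    String findById(int id, ReadPreference preference) {
        Map<Integer, String> node =
                (preference == ReadPreference.PRIMARY) ? primary : secondary;
        return node.get(id);
    }

    public static void main(String[] args) {
        ReplicaSetModel rs = new ReplicaSetModel();
        rs.insert(4660192, "persistentEntity");

        // Read lands on the secondary before replication: document not found.
        System.out.println(rs.findById(4660192, ReadPreference.NEAREST)); // null
        // Read from the primary sees the acknowledged write immediately.
        System.out.println(rs.findById(4660192, ReadPreference.PRIMARY)); // persistentEntity

        rs.replicate();
        // Retrying later (e.g. from a debugger) succeeds once replication has caught up.
        System.out.println(rs.findById(4660192, ReadPreference.NEAREST)); // persistentEntity
    }
}
```

This matches the symptoms in the question: the insert succeeds, the immediate findById returns null, and re-running the same query a moment later returns the document. In the real application the corresponding change would be to configure the read preference as primary, for example through MongoTemplate's setReadPreference with ReadPreference.primary() or the equivalent connection option; check the exact call against the driver and Spring Data versions in use.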