When using mongorestore with the --oplogReplay option to replay oplog entries, I ran into a strange bug: mongorestore does not correctly apply a $set operation on a BinData field. You can reproduce it with the following steps.
Insert a test document:
db.testData.insert({_id: 10000, data: BinData(0, ""), size: 10})
Update its BinData field:
db.testData.update({_id: 10000}, {$set: {data: BinData(0, "CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=")}})
Update another field:
db.testData.update({_id: 10000}, {$set: {size: 20}})
Check the oplog:
use local
db.oplog.rs.find().sort({$natural: -1})
You should see output like the following:
{ "ts" : Timestamp(1435627154, 1), "h" : NumberLong("-4979206321598144076"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "size" : 20 } } }
{ "ts" : Timestamp(1435627144, 1), "h" : NumberLong("2899524097634687825"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=") } } }
{ "ts" : Timestamp(1435627136, 1), "h" : NumberLong("-8486373688715225152"), "v" : 2, "op" : "i", "ns" : "test.testData", "o" : { "_id" : 10000, "data" : BinData(0,""), "size" : 10 } }
Dump the two update entries from the oplog and replay them. In a bash shell:
mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : {$gte: Timestamp(1435627144, 1)}}' -o ./oplogD/
mkdir -p ./oplogR
mv ./oplogD/local/oplog.rs.bson ./oplogR/oplog.bson
mongorestore --port 27017 --oplogReplay ./oplogR/
After this, the document is not what you would expect. In my case it became:
{ "_id" : 10000, "data" : BinData(0,"ADRAAAAAPiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 20 }
The size field is correct, but the data field is not: its first 8 base64 characters differ from the value that was set, while the rest of the value matches.
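To see exactly which bytes were damaged, the two base64 payloads can be decoded and compared. This is only a minimal sketch for Node.js; the helper name differingOffsets is made up for illustration and is not part of any MongoDB tool.

```javascript
// Base64 payloads copied from the post: the value that was $set, and the
// value found in the document after replaying both oplog entries.
const expectedB64 = "CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=";
const actualB64 = "ADRAAAAAPiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=";

// Hypothetical helper: returns the byte offsets at which two buffers differ.
function differingOffsets(a, b) {
  const offsets = [];
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) offsets.push(i);
  }
  return offsets;
}

const expected = Buffer.from(expectedB64, "base64");
const actual = Buffer.from(actualB64, "base64");
const offsets = differingOffsets(expected, actual);

console.log("lengths:", expected.length, actual.length);
console.log("differing byte offsets:", offsets);
console.log("expected prefix:", expected.subarray(0, 6).toString("hex"));
console.log("actual prefix:  ", actual.subarray(0, 6).toString("hex"));
```

Both values decode to 38 bytes and differ only in the first six, so the damage is confined to the start of the binary payload.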
The strangest part is that if you dump only the single oplog entry that sets data and replay it, the result is correct:
mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : Timestamp(1435627144, 1)}' -o ./tmpD/
mkdir -p ./tmpR
mv ./tmpD/local/oplog.rs.bson ./tmpR/oplog.bson
mongorestore --port 27017 --oplogReplay ./tmpR/
After this replay, the data field is correct:
{ "_id" : 10000, "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 10 }
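For reference, here is a small model of what a correct replay should produce in each case. It is a sketch under simplifying assumptions, not how mongorestore is implemented: only $set updates are handled, BinData is represented as a plain base64 string, timestamps are reduced to their seconds value, and applySetEntries is a hypothetical helper.

```javascript
// Minimal model of replaying oplog "u" entries that contain only $set.
// Assumption: BinData is modelled as a base64 string, ts as plain seconds.
function applySetEntries(doc, entries) {
  // Replay in timestamp order, oldest first, without mutating the inputs.
  const ordered = [...entries].sort((a, b) => a.ts - b.ts);
  return ordered.reduce((acc, e) => ({ ...acc, ...e.o.$set }), doc);
}

const NEW_DATA = "CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=";

// State of the document right after the insert.
const inserted = { _id: 10000, data: "", size: 10 };

// The two update entries from the oplog output, reduced to the fields used here.
const setData = { ts: 1435627144, op: "u", o: { $set: { data: NEW_DATA } } };
const setSize = { ts: 1435627154, op: "u", o: { $set: { size: 20 } } };

// Replaying both entries: data should be NEW_DATA and size should be 20.
const afterBoth = applySetEntries(inserted, [setSize, setData]);
// Replaying only the data entry: data should be NEW_DATA and size stays 10.
const afterDataOnly = applySetEntries(inserted, [setData]);

console.log(afterBoth);
console.log(afterDataOnly);
```

In both cases the data field should end up equal to the value that was set, which matches the single-entry replay but not the two-entry replay shown earlier.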
Why does this happen?
This was a bug in the MongoDB tools, and it was fixed in this commit:
https://github.com/mongodb/mongo-tools/commit/ed60bbfae7d2b5239bea69f162f0784e17995e91
You can track the bug report in JIRA.