
Error when using mongorestore to replay an oplog containing a binData field


When replaying oplog entries with mongorestore and the --oplogReplay option, I ran into a strange error: mongorestore does not correctly handle a $set operation on a binData field. You may hit the same error if you follow these steps:

  1. Insert a test document.

    db.testData.insert({_id: 10000, data: BinData(0, ""), size: 10})
    
  2. Update its binData field.

    db.testData.update({_id: 10000}, {$set: {data: BinData(0, "CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=")}})
    
  3. Update another field.

    db.testData.update({_id: 10000}, {$set: {size: 20}})
    
  4. Check the oplog.

    use local
    
    db.oplog.rs.find().sort({$natural: -1})
    

    You should see entries like the following:

    { "ts" : Timestamp(1435627154, 1), "h" : NumberLong("-4979206321598144076"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "size" : 20 } } }
    { "ts" : Timestamp(1435627144, 1), "h" : NumberLong("2899524097634687825"), "v" : 2, "op" : "u", "ns" : "test.testData", "o2" : { "_id" : 10000 }, "o" : { "$set" : { "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=") } } }
    { "ts" : Timestamp(1435627136, 1), "h" : NumberLong("-8486373688715225152"), "v" : 2, "op" : "i", "ns" : "test.testData", "o" : { "_id" : 10000, "data" : BinData(0,""), "size" : 10 } }
    
  5. Dump the two most recent oplog entries (the two updates) and replay them.

    In a bash shell:

    mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : {$gte: Timestamp(1435627144, 1)}}' -o ./oplogD/
    
    mkdir -p ./oplogR
    
    mv ./oplogD/local/oplog.rs.bson ./oplogR/oplog.bson
    
    mongorestore --port 27017 --oplogReplay ./oplogR/
    

    After this, the document is not what you would expect. In my case, it changed to this:

    { "_id" : 10000, "data" : BinData(0,"ADRAAAAAPiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 20 }
    

    The size field is correct, but the data field is not.

  6. The strangest part: if you dump only the single oplog entry for the binData update and replay it, the data is correct.

    mongodump --port 27017 -d local -c oplog.rs --query '{"ts" : Timestamp(1435627144, 1)}' -o ./tmpD/
    
    mkdir -p ./tmpR
    
    mv ./tmpD/local/oplog.rs.bson ./tmpR/oplog.bson
    
    mongorestore --port 27017 --oplogReplay ./tmpR/
    

    After the oplog entry is replayed, the data field is correct:

    { "_id" : 10000, "data" : BinData(0,"CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA="), "size" : 10 }
    

    Why does this happen?
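
To make the corruption from step 5 concrete, the two base64 payloads can be decoded and compared byte by byte. This is only a minimal sketch, assuming Node.js is available (the steps above use just the mongo shell and bash); `diffBytes` is a hypothetical helper name introduced here for illustration, not part of any MongoDB tool.

```javascript
// Base64 payloads copied from the update in step 2 and the result in step 5.
const expected = "CgxVfs93PiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=";
const restored = "ADRAAAAAPiT/DrxMSvASFgoNMTAuMTYwLjIyMi4xMhDEJxgKIAA=";

// Hypothetical helper: returns the byte offsets at which two
// base64-encoded values differ. Offsets past the end of the shorter
// buffer are counted as differing.
function diffBytes(a, b) {
  const bufA = Buffer.from(a, "base64");
  const bufB = Buffer.from(b, "base64");
  const offsets = [];
  const n = Math.max(bufA.length, bufB.length);
  for (let i = 0; i < n; i++) {
    if (bufA[i] !== bufB[i]) offsets.push(i);
  }
  return offsets;
}

const offsets = diffBytes(expected, restored);
console.log("payload length:", Buffer.from(expected, "base64").length);
console.log("differing offsets:", offsets);
```

For these two values, only the first six bytes (offsets 0 to 5) of the 38-byte payload differ; the rest of the payload is identical.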


Solution

  • This bug was fixed in mongo-tools in the following commit:

    https://github.com/mongodb/mongo-tools/commit/ed60bbfae7d2b5239bea69f162f0784e17995e91

    The bug report is tracked in JIRA:

    https://jira.mongodb.org/browse/TOOLS-807