Search code examples
mongodbrecoveryreplicaset

MongoDB: ***aborting after fassert() failure error during converting a standalone to a replica set


Mongo version: mongo:4.2.6

I'm following manual Convert a Standalone to a Replica Set to run MongoDB in Replica Set Mode.

When I trying launch MongoDB with command mongod --replSet rs0 - I got the next log: ***aborting after fassert() failure

mongodb-hotbot_1    | 2020-11-29T07:54:06.233+0000 I  CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 W  ASIO     [main] No TransportLayer configured during NetworkInterface startup
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=8c7762d33a84
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] db version v4.2.6
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] git version: 20364840b8f1af16917e4c23c1b5f5efd8b352f8
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.1.1  11 Sep 2018
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] allocator: tcmalloc
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] modules: none
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] build environment:
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten]     distmod: ubuntu1804
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten]     distarch: x86_64
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten]     target_arch: x86_64
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] options: { net: { bindIp: "*" }, replication: { replSet: "rs0" } }
mongodb-hotbot_1    | 2020-11-29T07:54:06.237+0000 W  STORAGE  [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
mongodb-hotbot_1    | 2020-11-29T07:54:06.240+0000 I  STORAGE  [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
mongodb-hotbot_1    | 2020-11-29T07:54:06.241+0000 W  STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
mongodb-hotbot_1    | 2020-11-29T07:54:06.242+0000 I  STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=483M,cache_overflow=(file_max=0M),session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress],
mongodb-hotbot_1    | 2020-11-29T07:54:06.737+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636446:737260][1:0x7fb7b9e5bb00], txn-recover: Recovering log 548 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.233+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:233737][1:0x7fb7b9e5bb00], txn-recover: Recovering log 549 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.735+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:735489][1:0x7fb7b9e5bb00], txn-recover: Main recovery loop: starting at 548/256 to 549/256
mongodb-hotbot_1    | 2020-11-29T07:54:07.739+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:739240][1:0x7fb7b9e5bb00], txn-recover: Recovering log 548 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.792+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:792369][1:0x7fb7b9e5bb00], txn-recover: Recovering log 549 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.826+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:826673][1:0x7fb7b9e5bb00], txn-recover: Set global recovery timestamp: (1598047236, 1)
mongodb-hotbot_1    | 2020-11-29T07:54:07.849+0000 I  RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(1598047236, 1)
mongodb-hotbot_1    | 2020-11-29T07:54:07.890+0000 I  STORAGE  [initandlisten] Starting OplogTruncaterThread local.oplog.rs
mongodb-hotbot_1    | 2020-11-29T07:54:07.890+0000 I  STORAGE  [initandlisten] The size storer reports that the oplog contains 1372748 records totaling to 335339428 bytes
mongodb-hotbot_1    | 2020-11-29T07:54:07.890+0000 I  STORAGE  [initandlisten] Sampling the oplog to determine where to place markers for truncation
mongodb-hotbot_1    | 2020-11-29T07:54:07.901+0000 I  STORAGE  [initandlisten] Sampling from the oplog between Aug  6 11:40:28:1 and Aug 22 01:00:20:2 to determine where to place markers for truncation
mongodb-hotbot_1    | 2020-11-29T07:54:07.901+0000 I  STORAGE  [initandlisten] Taking 24 samples and assuming that each section of oplog contains approximately 554226 records totaling to 135388162 bytes
mongodb-hotbot_1    | 2020-11-29T07:54:07.992+0000 I  STORAGE  [initandlisten] Placing a marker at optime Feb 17 12:37:42:1
mongodb-hotbot_1    | 2020-11-29T07:54:07.992+0000 I  STORAGE  [initandlisten] Placing a marker at optime Jun 16 13:50:38:524
mongodb-hotbot_1    | 2020-11-29T07:54:07.992+0000 I  STORAGE  [initandlisten] WiredTiger record store oplog processing took 101ms
mongodb-hotbot_1    | 2020-11-29T07:54:07.995+0000 I  STORAGE  [initandlisten] Timestamp monitor starting
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten]
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten]
mongodb-hotbot_1    | 2020-11-29T07:54:08.057+0000 I  SHARDING [initandlisten] Marking collection local.system.replset as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.074+0000 I  STORAGE  [initandlisten] Flow Control is enabled on this deployment.
mongodb-hotbot_1    | 2020-11-29T07:54:08.074+0000 I  SHARDING [initandlisten] Marking collection admin.system.roles as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.074+0000 I  SHARDING [initandlisten] Marking collection admin.system.version as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.083+0000 I  SHARDING [initandlisten] Marking collection local.startup_log as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.084+0000 I  FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
mongodb-hotbot_1    | 2020-11-29T07:54:08.085+0000 I  SHARDING [initandlisten] Marking collection local.replset.minvalid as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.086+0000 I  SHARDING [initandlisten] Marking collection local.replset.election as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 I  REPL     [initandlisten] Rollback ID is 1
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  REPL     [initandlisten] This instance has been repaired and may contain modified replicated data that would not match other replica set members. To see your repaired data, start mongod without the --replSet option. When you are finished recovering your data and would like to perform a complete re-sync, please refer to the documentation here: https://docs.mongodb.com/manual/tutorial/resync-replica-set-member/
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  -        [initandlisten] Fatal Assertion 50923 at src/mongo/db/repl/replication_coordinator_impl.cpp 527
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  -        [initandlisten]
mongodb-hotbot_1    |
mongodb-hotbot_1    | ***aborting after fassert() failure
mongodb-hotbot_1    |
mongodb-hotbot_1    |

Here is the message in the end of the log before the aborting process:

This instance has been repaired and may contain modified replicated data that would not match other replica set members. To see your repaired data, start mongod without the --replSet option. When you are finished recovering your data and would like to perform a complete re-sync, please refer to the documentation here: https://docs.mongodb.com/manual/tutorial/resync-replica-set-member/

So I tried to repair my data by following the next manual Recover a Standalone after an Unexpected Shutdown

After that in the end I got the empty file mongod.lock in my dbPath directory, as I understand it a signal that recovering was finished successfully.

Then I launch mongodb in standalone mode and everything worked fine, so I gracefully shutdown it then, and double checked that mongod.lock file still empty after.

Finally I tried to launch command mongod --replSet rs0 with empty mongod.lock but again I'm getting the same error and mongod.lock file updates with 1 on the first line after...

Any ideas how can I fix this issue and launch mongo in replica set mode with my data?


Solution

  • Here is what I have done to solve the issue:

    1. Launch a repair mongod process with my data directory - mongod --dbpath /data/db --repair
    2. Launch a standalone mongod with my data directory – mongod --dbpath /data/db
    3. Make dump – mongodump --host=localhost --port=27017 --out=/tmp/dumps/1
    4. Connect to db locally via mongo shell and shutdown it gracefully – mongo "mongodb://localhost:27017/admin" & db.shutdownServer()
    5. Run new mongod in RS mode with new clean data directory – mongod --dbpath /data/db_recovered --replSet rs0
    6. Connect to db locally in RS mode via mongo shell and run commands – mongo "mongodb://localhost:27017/admin" & rs.initiate() & db.isMaster()
    7. Restore data from dump – mongorestore --host=localhost --port=27017 /tmp/dumps/1
    8. Finally I got the mognod running in RS mode with my data 🥳