couchdb, replication, connection-timeout

CouchDB 1.6 continuous replication: how to configure resume time after connection failure


I've got a network with many nodes, each running its own CouchDB instance. Each instance has one or more documents in its _replicator database that set up continuous (master-master) replication with some of the other nodes in the network. These nodes can go offline for an unpredictable length of time. Replication itself works great, but I'm running into a problem with the retry delay when a node goes offline.

The CouchDB documentation says (http://docs.couchdb.org/en/1.6.1/replication/replicator.html):

"When you PUT/POST a document to the _replicator database, CouchDB will attempt to start the replication up to 10 times (configurable under [replicator], parameter max_replication_retry_count). If it fails on the first attempt, it waits 5 seconds before doing a second attempt. If the second attempt fails, it waits 10 seconds before doing a third attempt. If the third attempt fails, it waits 20 seconds before doing a fourth attempt (each attempt doubles the previous wait period)."

When a replication target node goes offline (and thus replication fails), the log file says:

Restarting replication in 5 seconds.
Restarting replication in 10 seconds.
Restarting replication in 20 seconds.
Restarting replication in 40 seconds.
Restarting replication in 80 seconds.
Restarting replication in 160 seconds.
Restarting replication in 320 seconds.
Restarting replication in 600 seconds.
Restarting replication in 600 seconds...

(600 seconds seems to be the max timeout.)

I need to speed up how quickly replication resumes after a lost connection.

Is this value hardcoded in CouchDB sources?

Is there a parameter to redefine/override the 600-second interval with something else?


Solution

  • You can change neither the initial delay nor the max delay in CouchDB 1.x. Their respective values (2.5 seconds [immediately multiplied by 2 giving 5 seconds] and 600 seconds) are hardcoded in source file couch_replicator_manager.erl and cannot be modified unless you modify CouchDB itself.
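
To see how those two hardcoded values produce the sequence in the log, here is a minimal Python sketch of a doubling backoff with a cap. It is not the Erlang code from couch_replicator_manager.erl; the function name `restart_delays` and the constant names are hypothetical, and the only facts taken from above are the 2.5-second starting value, the doubling, and the 600-second ceiling.

```python
# Illustrative model only: names are made up for this sketch and do not
# correspond to identifiers in the CouchDB source.
INITIAL_DELAY = 2.5   # seconds; doubled before the first wait, giving 5
MAX_DELAY = 600.0     # seconds; upper bound on any single wait


def restart_delays(attempts, initial=INITIAL_DELAY, max_delay=MAX_DELAY):
    """Return the wait (in seconds) before each of the first `attempts` restarts."""
    delays = []
    wait = initial
    for _ in range(attempts):
        # Double the previous wait, but never exceed the cap.
        wait = min(wait * 2, max_delay)
        delays.append(wait)
    return delays


if __name__ == "__main__":
    for seconds in restart_delays(9):
        print(f"Restarting replication in {seconds:g} seconds.")
```

Run as a script, this prints the same nine delays as the log in the question: 5, 10, 20, 40, 80, 160, 320, 600, 600. Under this model the first seven waits add up to 635 seconds, after which an unreachable node is retried only once every 10 minutes.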