I have a two-node distributed OrientDB system running in embedded mode, using TCP/IP for node discovery. The class event
is sharded across four clusters. After restarting one node, exactly half of the inserts on that node fail with the following log message:
INFO Local node 'orientdb-lab-node2' is not the owner for cluster 'event_1' (it is 'orientdb-lab-node1'). Reloading distributed configuration for database 'test-db' [ODistributedStorage]
and the stack trace:
com.orientechnologies.orient.server.distributed.ODistributedConfigurationChangedException: Local node 'orientdb-lab-node2' is not the owner for cluster 'event_1' (it is 'orientdb-lab-node1')
DB name="test-db"
DB name="test-db"
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.throwSerializedException(OChannelBinaryAsynchClient.java:437)
at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.handleStatus(OChannelBinaryAsynchClient.java:388)
at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:270)
at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:162)
at com.orientechnologies.orient.client.remote.OStorageRemote.beginResponse(OStorageRemote.java:2138)
at com.orientechnologies.orient.client.remote.OStorageRemote$6.execute(OStorageRemote.java:548)
at com.orientechnologies.orient.client.remote.OStorageRemote$6.execute(OStorageRemote.java:542)
at com.orientechnologies.orient.client.remote.OStorageRemote$1.execute(OStorageRemote.java:164)
at com.orientechnologies.orient.client.remote.OStorageRemote.baseNetworkOperation(OStorageRemote.java:235)
at com.orientechnologies.orient.client.remote.OStorageRemote.asyncNetworkOperation(OStorageRemote.java:156)
at com.orientechnologies.orient.client.remote.OStorageRemote.createRecord(OStorageRemote.java:528)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:2095)
at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveNew(OTransactionNoTx.java:246)
at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.java:179)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2597)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:103)
at com.orientechnologies.orient.core.record.impl.ODocument.save(ODocument.java:1802)
at com.orientechnologies.orient.core.record.impl.ODocument.save(ODocument.java:1793)
at lab.orientdb.OrientDbClient.insert(OrientDbClient.java:10)
at lab.orientdb.Main.main(Main.java:24)
This is what the cluster configuration looks like from node1:
Node 1 and 2 running, 10 inserts on each node
CLUSTERS (collections)
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|#   |NAME       |  ID|CLASS    |CONFLICT-STRATEGY|COUNT|   OWNER_SERVER   |   OTHER_SERVERS    |AUTO_DEPLOY_NEW_NODE|
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|5   |event      |  17|event    |                 |    8|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
|6   |event_1    |  18|event    |                 |    3|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|7   |event_2    |  19|event    |                 |    2|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|8   |event_3    |  20|event    |                 |    7|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|    |TOTAL      |    |         |                 |   20|                  |                    |                    |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
Node 2 stopped
CLUSTERS (collections)
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|#   |NAME       |  ID|CLASS    |CONFLICT-STRATEGY|COUNT|   OWNER_SERVER   |   OTHER_SERVERS    |AUTO_DEPLOY_NEW_NODE|
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|5   |event      |  17|event    |                 |    8|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|6   |event_1    |  18|event    |                 |    3|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|7   |event_2    |  19|event    |                 |    2|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|8   |event_3    |  20|event    |                 |    7|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|    |TOTAL      |    |         |                 |   20|                  |                    |                    |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
Node 2 restarted, 5 successful inserts and 5 failed
CLUSTERS (collections)
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|#   |NAME       |  ID|CLASS    |CONFLICT-STRATEGY|COUNT|   OWNER_SERVER   |   OTHER_SERVERS    |AUTO_DEPLOY_NEW_NODE|
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|5   |event      |  17|event    |                 |   11|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
|6   |event_1    |  18|event    |                 |    3|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|7   |event_2    |  19|event    |                 |    2|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|8   |event_3    |  20|event    |                 |    9|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|    |TOTAL      |    |         |                 |   25|                  |                    |                    |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
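To make the "exactly half" pattern easier to see, here is a minimal sketch that replays ten inserts against the ownership table reported after node 2 restarted. It is a toy model, not OrientDB code: the class OwnershipSim and its methods are hypothetical, and it assumes that inserts sent to node 2 are spread round-robin over all four clusters starting at event, with any insert that lands on a cluster owned by node 1 rejected. Under that assumption the model accepts 3 inserts on event and 2 on event_3, which matches the count changes in the tables above (8 to 11 and 7 to 9), and rejects the other 5.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model, not OrientDB code: replays round-robin cluster selection
// against the ownership table reported after node 2 restarted.
class OwnershipSim {

    // Cluster name -> owner, copied from the OWNER_SERVER column of the last table.
    static Map<String, String> ownersAfterRestart() {
        Map<String, String> owners = new LinkedHashMap<>();
        owners.put("event", "orientdb-lab-node2");
        owners.put("event_1", "orientdb-lab-node1");
        owners.put("event_2", "orientdb-lab-node1");
        owners.put("event_3", "orientdb-lab-node2");
        return owners;
    }

    // Returns, per cluster, how many of `inserts` round-robin attempts are accepted
    // when issued on `localNode`. Attempts that land on a cluster owned by another
    // node are counted as rejected (assumption: no retry, no redirect).
    static Map<String, Integer> simulate(Map<String, String> owners, String localNode, int inserts) {
        String[] clusters = owners.keySet().toArray(new String[0]);
        Map<String, Integer> accepted = new LinkedHashMap<>();
        for (String cluster : clusters) {
            accepted.put(cluster, 0);
        }
        for (int i = 0; i < inserts; i++) {
            String cluster = clusters[i % clusters.length];
            if (owners.get(cluster).equals(localNode)) {
                accepted.merge(cluster, 1, Integer::sum);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        int inserts = 10;
        Map<String, Integer> accepted = simulate(ownersAfterRestart(), "orientdb-lab-node2", inserts);
        int ok = accepted.values().stream().mapToInt(Integer::intValue).sum();
        System.out.println("accepted per cluster: " + accepted);
        System.out.println("successful=" + ok + " failed=" + (inserts - ok));
    }
}
```

If this model is right, the failures are simply the inserts that get routed to event_1 and event_2, which node 1 kept after node 2 came back.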
Any tip or advice appreciated. Thanks.
This issue has been resolved in OrientDB 2.2.13-SNAPSHOT, so the fix should be available in a release version very soon: https://github.com/orientechnologies/orientdb/issues/6897