
Inserts (Java API) fail after restarting a node in a distributed OrientDB cluster


I have a two-node distributed OrientDB system running in embedded mode, using TCP/IP for node discovery. The class event is sharded across four clusters. After restarting one node, exactly half of the inserts issued on that node fail with the error message:

INFO Local node 'orientdb-lab-node2' is not the owner for cluster 'event_1' (it is 'orientdb-lab-node1'). Reloading distributed configuration for database 'test-db' [ODistributedStorage]

and the stack trace:

com.orientechnologies.orient.server.distributed.ODistributedConfigurationChangedException: Local node 'orientdb-lab-node2' is not the owner for cluster 'event_1' (it is 'orientdb-lab-node1')
    DB name="test-db"
    DB name="test-db"
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.throwSerializedException(OChannelBinaryAsynchClient.java:437)
    at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.handleStatus(OChannelBinaryAsynchClient.java:388)
    at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:270)
    at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:162)
    at com.orientechnologies.orient.client.remote.OStorageRemote.beginResponse(OStorageRemote.java:2138)
    at com.orientechnologies.orient.client.remote.OStorageRemote$6.execute(OStorageRemote.java:548)
    at com.orientechnologies.orient.client.remote.OStorageRemote$6.execute(OStorageRemote.java:542)
    at com.orientechnologies.orient.client.remote.OStorageRemote$1.execute(OStorageRemote.java:164)
    at com.orientechnologies.orient.client.remote.OStorageRemote.baseNetworkOperation(OStorageRemote.java:235)
    at com.orientechnologies.orient.client.remote.OStorageRemote.asyncNetworkOperation(OStorageRemote.java:156)
    at com.orientechnologies.orient.client.remote.OStorageRemote.createRecord(OStorageRemote.java:528)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:2095)
    at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveNew(OTransactionNoTx.java:246)
    at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.java:179)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2597)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:103)
    at com.orientechnologies.orient.core.record.impl.ODocument.save(ODocument.java:1802)
    at com.orientechnologies.orient.core.record.impl.ODocument.save(ODocument.java:1793)
    at lab.orientdb.OrientDbClient.insert(OrientDbClient.java:10)
    at lab.orientdb.Main.main(Main.java:24)

This is what the cluster configuration looks like from node1:

Node 1 and 2 running, 10 inserts on each node

CLUSTERS (collections)
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|#   |NAME       |  ID|CLASS    |CONFLICT-STRATEGY|COUNT|   OWNER_SERVER   |   OTHER_SERVERS    |AUTO_DEPLOY_NEW_NODE|
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|5   |event      |  17|event    |                 |    8|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
|6   |event_1    |  18|event    |                 |    3|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|7   |event_2    |  19|event    |                 |    2|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|8   |event_3    |  20|event    |                 |    7|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|    |TOTAL      |    |         |                 |   20|                  |                    |                    |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+

Node 2 stopped

CLUSTERS (collections)
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|#   |NAME       |  ID|CLASS    |CONFLICT-STRATEGY|COUNT|   OWNER_SERVER   |   OTHER_SERVERS    |AUTO_DEPLOY_NEW_NODE|
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|5   |event      |  17|event    |                 |    8|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|6   |event_1    |  18|event    |                 |    3|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|7   |event_2    |  19|event    |                 |    2|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|8   |event_3    |  20|event    |                 |    7|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|    |TOTAL      |    |         |                 |   20|                  |                    |                    |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+

Node 2 restarted, 5 successful inserts and 5 failed

CLUSTERS (collections)
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|#   |NAME       |  ID|CLASS    |CONFLICT-STRATEGY|COUNT|   OWNER_SERVER   |   OTHER_SERVERS    |AUTO_DEPLOY_NEW_NODE|
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|5   |event      |  17|event    |                 |   11|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
|6   |event_1    |  18|event    |                 |    3|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|7   |event_2    |  19|event    |                 |    2|orientdb-lab-node1|[orientdb-lab-node2]|        true        |
|8   |event_3    |  20|event    |                 |    9|orientdb-lab-node2|[orientdb-lab-node1]|        true        |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+
|    |TOTAL      |    |         |                 |   25|                  |                    |                    |
+----+-----------+----+---------+-----------------+-----+------------------+--------------------+--------------------+

Any tips or advice would be appreciated. Thanks.


Solution

  • This issue has been resolved in OrientDB 2.2.13-SNAPSHOT, so the fix should be available in a release version very soon: https://github.com/orientechnologies/orientdb/issues/6897
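
For anyone wondering why exactly half of the inserts fail: the counts in the question are consistent with the restarted node cycling through all four clusters of event in turn instead of only the two it owns. The sketch below is plain Java with no OrientDB dependency. The ownership map is copied from the "Node 2 restarted" table; the round-robin selection and the names ClusterOwnershipSketch and countRejected are assumptions made for illustration, not taken from the OrientDB source or the linked issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusterOwnershipSketch {

    // Owner of each cluster of class "event", copied from the
    // "Node 2 restarted" table in the question.
    static Map<String, String> ownersAfterRestart() {
        Map<String, String> owners = new LinkedHashMap<>();
        owners.put("event", "orientdb-lab-node2");
        owners.put("event_1", "orientdb-lab-node1");
        owners.put("event_2", "orientdb-lab-node1");
        owners.put("event_3", "orientdb-lab-node2");
        return owners;
    }

    // Counts the inserts that would be rejected if localNode picked the target
    // cluster round-robin over ALL clusters (assumed behaviour, hypothetical model).
    // An insert is rejected when the target cluster is owned by another node.
    static int countRejected(Map<String, String> owners, String localNode, int inserts) {
        String[] clusters = owners.keySet().toArray(new String[0]);
        int rejected = 0;
        for (int i = 0; i < inserts; i++) {
            String target = clusters[i % clusters.length];
            if (!localNode.equals(owners.get(target))) {
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        int rejected = countRejected(ownersAfterRestart(), "orientdb-lab-node2", 10);
        System.out.println("rejected " + rejected + " of 10 inserts");
    }
}
```

Under this model, 10 inserts on node2 give 5 rejections, with 3 successes landing in event and 2 in event_3. That matches the counts in the last table (event 8 to 11, event_3 7 to 9), though it is only one reading of the numbers and not a confirmed diagnosis.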