Consistency issue with Aerospike


I am using Aerospike (v3.5.12) as an in-memory key-value store on a single node.

I am also using the Java client (v3.1.7) to read and write data.

I notice that at a certain load (3K QPS of batch reads and writes), some reads fail to find data that was just written:

[INFO] [24/11/2015 14:43:38.782] Write Data
[ERROR] [24/11/2015 14:43:38.937] Read Data - Not Found

Has anyone encountered a similar issue?

UPDATE

I updated to Aerospike 3.7.1.

I am using the async operations in the following manner:

    public void store() {
        WritePolicy expirationWritePolicy = new WritePolicy();
        expirationWritePolicy.sendKey = true;
        expirationWritePolicy.priority = Priority.HIGH;
        expirationWritePolicy.expiration = 10;

        Key key = new Key(namespace, SET_NAME, requestId);
        Bin bin = new Bin(BIN_NAME, serializer.toBinary(budgetCommit));
        Bin extra = new Bin("extra", "data");

        client.put(expirationWritePolicy, new WriteListener() {
            @Override
            public void onSuccess(Key key) {
                logger.info("Succeed to store {}", requestId);
            }

            @Override
            public void onFailure(AerospikeException exception) {
                logger.error(exception, "Fail to store {}", key);
            }
        }, key, extra, bin);
    }

    public void retrieve() {
        WritePolicy defaultWritePolicy = new WritePolicy();
        defaultWritePolicy.priority = Priority.LOW;
        defaultWritePolicy.sendKey = true;

        Key key = new Key(namespace, SET_NAME, requestId);
        Bin closeExtra = new Bin("extra", "_closed");

        client.operate(defaultWritePolicy, new RecordListener() {
                    @Override
                    public void onSuccess(Key key, Record record) {
                        // Treat a missing record or a missing bin as a failed read.
                        if (record == null || record.getValue(BIN_NAME) == null) {
                            logger.error("Fail to retrieve {}", requestId);
                        }
                    }

                    @Override
                    public void onFailure(AerospikeException exception) {
                        logger.error("Fail to retrieve {} : {}", requestId, exception.getMessage());
                    }
                }, key,
                Operation.append(closeExtra), Operation.get());
    }

[INFO] [12/01/2016 08:37:16.732] Succeed to store 379e67dc-945d-4717-97a7-721cc8093c05
[ERROR] [12/01/2016 08:37:16.736] Fail to retrieve 379e67dc-945d-4717-97a7-721cc8093c05

The onSuccess callback is called once Aerospike acknowledges the write.

The failures start at around 8K QPS of master writes.


Solution

  • A rebalance round is probably in progress. The batch get in that server version does not proxy requests to other nodes.

    Aerospike Server 3.6.4 introduced a new batch implementation that can proxy.

    http://www.aerospike.com/download/server/notes.html#3.6.4
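
    If upgrading is not possible right away, one client-side stopgap is to retry a read that comes back empty. This is not part of the answer above, only a minimal sketch under assumptions: RetryingRead and readWithRetry are hypothetical names, not Aerospike client APIs, and the read itself is passed in as a Supplier so the example runs without a cluster.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper; not part of the Aerospike Java client.
final class RetryingRead {

    /**
     * Calls readOnce up to maxAttempts times, sleeping delayMillis between
     * attempts. Returns the first non-null result, or null if every attempt
     * misses or the thread is interrupted while waiting.
     */
    static <T> T readWithRetry(Supplier<T> readOnce, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = readOnce.get();
            if (result != null) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for a real read: misses twice, then finds the record.
        AtomicInteger calls = new AtomicInteger();
        Supplier<String> fakeRead = () -> calls.incrementAndGet() < 3 ? null : "record";

        String value = readWithRetry(fakeRead, 5, 10L);
        System.out.println(value + " after " + calls.get() + " attempts");
    }
}
```

    With the synchronous client, the supplier could be something like () -> client.get(null, key), since get returns null when the record is not found. Retrying only hides the symptom, so the upgrade remains the actual fix.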