Tags: python, cassandra, nosql, amazon-keyspaces, datastax-python-driver

Is it possible to get a callback from Cassandra after the INSERT operation that the record was created successfully?


I have run into some strange behavior and am trying to understand when it can occur. My Python application accesses a Cassandra database through the DataStax Python driver.

As the snippet below shows, I first run an INSERT that creates a record in the table, then a SELECT that should return the message I just created. Sometimes the SELECT returns nothing. My guess is that Cassandra has an internal scheduler that queues the INSERT, so the record does not exist yet when the SELECT runs. Is this possible?

QUESTION:

Is it possible to get a callback from Cassandra after the INSERT operation that the record was created successfully?

SNIPPET:

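import logging
import sys
import uuid

from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement, dict_factory

logger = logging.getLogger(__name__)
# db_connection is assumed to be a cassandra.cluster.Session created elsewhere in the application.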


def send_message(chat_room_id, message_author_id, message_text):
    message_id = uuid.uuid1()


    first_query = """
    insert into messages (
        created_date_time,
        chat_room_id,
        message_id,
        message_author_id,
        message_text
    ) values (
        toTimestamp(now()),
        {0},
        {1},
        {2},
        {3}
    );
    """.format(
        chat_room_id,
        message_id,
        message_author_id,
        message_text
    )

    first_statement = SimpleStatement(
        first_query,
        consistency_level=ConsistencyLevel.LOCAL_QUORUM
    )

    try:
        db_connection.execute(first_statement)
    except Exception as error:
        logger.error(error)
        sys.exit(1)

    db_connection.row_factory = dict_factory

    second_query = """
    select
        created_date_time,
        chat_room_id,
        message_id,
        message_author_id,
        message_text
    from
        messages
    where
        chat_room_id = {0}
    and
        message_id = {1}
    limit 1;
    """.format(
        chat_room_id,
        message_id
    )

    try:
        message = db_connection.execute(second_query).one()
    except Exception as error:
        logger.error(error)
        sys.exit(1)

    print(message)  # Sometimes, when it is the first message in the chat room, this prints None.

Solution

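  • execute() is synchronous: when the INSERT call returns without raising an exception, Cassandra has acknowledged the write at the consistency level you requested. You do not need a separate callback to know that the record was written.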

    It looks like you are inserting with a consistency level (CL) of LOCAL_QUORUM, but no CL is set when you select the same record.

    By default, the Python driver uses LOCAL_ONE when no consistency level is set:

    https://docs.datastax.com/en/developer/python-driver/3.24/getting_started/#setting-a-consistency-level

    In your case, assuming a replication factor of 3, an INSERT at LOCAL_QUORUM returns once at least 2 of the 3 replicas in the local datacenter have acknowledged the write.

    (Note that Cassandra always sends the write to all replica nodes; the consistency level only controls how many acknowledgements it waits for.)

    If you then query with LOCAL_ONE, the read may be served by one of those 2 replicas and return the row, or by the third replica, which may not have applied the write yet.

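    To read your own writes reliably in Cassandra, the replicas you read from must overlap with the replicas that acknowledged the write (read replicas + write replicas > replication factor). The usual way to get that is LOCAL_QUORUM for both reads and writes: with a replication factor of 3, that is 2 + 2 > 3.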

    Try using LOCAL_QUORUM for the SELECT as well, or set the default consistency level to LOCAL_QUORUM through the default execution profile: https://docs.datastax.com/en/developer/python-driver/3.24/getting_started/#execution-profiles
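
A minimal sketch of the suggestions above, assuming the messages table from the question. The helper names (connect, send_message_async, on_success, on_error) are hypothetical, and the contact points and keyspace are placeholders you would replace with your own. It sets LOCAL_QUORUM once on the default execution profile so that both the INSERT and the SELECT use it, and passes values as %s parameters instead of formatting them into the query string. If you really do want a callback rather than a blocking call, the driver's execute_async() returns a future that accepts one through add_callbacks().

```python
import uuid

INSERT_CQL = """
insert into messages (
    created_date_time, chat_room_id, message_id, message_author_id, message_text
) values (toTimestamp(now()), %s, %s, %s, %s)
"""

SELECT_CQL = """
select created_date_time, chat_room_id, message_id, message_author_id, message_text
from messages
where chat_room_id = %s and message_id = %s
limit 1
"""


def connect(contact_points, keyspace):
    """Open a session whose default profile reads and writes at LOCAL_QUORUM."""
    # Imported here so the helpers below can be tried with a stub session
    # on a machine that does not have the driver installed.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.query import dict_factory

    profile = ExecutionProfile(
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
        row_factory=dict_factory,
    )
    cluster = Cluster(contact_points, execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    return cluster.connect(keyspace)


def send_message(session, chat_room_id, message_author_id, message_text):
    """Insert a message, then read it back through the same session."""
    message_id = uuid.uuid1()
    # The driver quotes and encodes the %s parameters itself.
    session.execute(
        INSERT_CQL, (chat_room_id, message_id, message_author_id, message_text)
    )
    # execute() has returned, so the write was acknowledged. With a session
    # from connect(), this read also runs at LOCAL_QUORUM.
    return session.execute(SELECT_CQL, (chat_room_id, message_id)).one()


def send_message_async(session, chat_room_id, message_author_id, message_text,
                       on_success, on_error):
    """Non-blocking variant: the driver calls on_success or on_error later."""
    message_id = uuid.uuid1()
    future = session.execute_async(
        INSERT_CQL, (chat_room_id, message_id, message_author_id, message_text)
    )
    future.add_callbacks(callback=on_success, errback=on_error)
    return message_id, future
```

With a session from connect(["127.0.0.1"], "my_keyspace") (placeholder values), send_message(session, chat_room_id, author_id, "hello") should no longer return None for a message that was just written, because the read and the write now overlap on at least one replica.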