python google-cloud-firestore datastore

Transactions and array type inconsistencies


I am running Firestore in Datastore mode. In one of my kinds, each entity has an array property in which I store the keys of related items.

In one of my Cloud Run services I write a new entity (of another kind) and append its key to this array on an existing entity. This write can happen from several places concurrently. I am using "allocate_ids" to pre-allocate the keys...

However, sometimes one of my writes gets overwritten, even though I am using transactions. My code is as follows:

    # key = the key of the existing entity to which I want to add data
    # data = the data to put into the new item

    # We pre-allocate the keys (so we know what to store in the array)
    itemkeys = client.allocate_ids(client.key("item"), 1)
    itemkey = itemkeys[0]

    trans = client.transaction()
    trans.begin()

    # First handle item
    print(f"Writing to item key: {itemkey.id}")
    item = datastore.Entity(itemkey)
    item.update(data)

    # Then entity
    print(f"Adding to entity : {key.id}")
    entity = client.get(key)
    print(f"Current length: {len(entity['array'])}")
    entity["array"].append(itemkey)
    print(f"New length: {len(entity['array'])}")

    # Write the items
    trans.put(item)
    trans.put(entity)

    # Commit and pray
    trans.commit()

It looks like sometimes the entity is taken from cache, even though I have put the "get" inside the transaction...

I have logged this information (it's running in a Cloud Run container), and I see the following:

  • 2023-03-20T10:03:37.668646Z Writing to item key: 4660848840671232
  • 2023-03-20T10:03:37.668662Z Adding entity: 5508678350274560
  • 2023-03-20T10:03:37.697575Z Current length: 0
  • 2023-03-20T10:03:37.697590Z New length: 1
  • ..some time passes
  • 2023-03-20T10:03:51.020477Z Writing to item key: 6436988542517248
  • 2023-03-20T10:03:51.020508Z Adding entity: 5508678350274560
  • 2023-03-20T10:03:51.058243Z Current length: 0
  • 2023-03-20T10:03:51.058253Z New length: 1

Can anyone provide some options or insight into what is going on? Am I trying something that is not supported?

N.B. I checked, and the database is currently using "Pessimistic" concurrency.


Solution

  • Ok, so after some serious searching I found another microservice that was updating this entity.

    The problem was that sometimes a lot of time (multiple seconds) passed between retrieving the entity, adjusting a single attribute, and writing the entity back, so that write could overwrite changes made in the meantime.
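
    To make the failure mode concrete, here is a minimal in-memory model of it, not the Datastore API: a plain dict stands in for the stored entity, and the function and item names are made up for illustration. Two writers that read early and write late reproduce the pattern from the logs in the question, where both see length 0 and both end at length 1.

```python
def run_stale_writers():
    """Two writers read early, then write later from their old copies."""
    store = {"array": []}
    copy_a = list(store["array"])   # writer A reads: []
    copy_b = list(store["array"])   # writer B reads: []
    copy_a.append("item-1")
    store["array"] = copy_a         # A writes ["item-1"]
    copy_b.append("item-2")
    store["array"] = copy_b         # B writes ["item-2"], dropping item-1
    return store["array"]


def run_fresh_writers():
    """Each writer re-reads immediately before modifying and writing."""
    store = {"array": []}
    for item in ("item-1", "item-2"):
        current = list(store["array"])  # read just before the write
        current.append(item)
        store["array"] = current
    return store["array"]


print(run_stale_writers())  # ['item-2']
print(run_fresh_writers())  # ['item-1', 'item-2']
```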

    The take-away here is that, when you update an entity, you must retrieve it inside the transaction, just before you update the attributes and write it back.

    I.e. in pseudo-code:

        with data_client.transaction():
            # Read inside the transaction, immediately before modifying
            entity = data_client.get(key)
            entity["attribute"] = value
            data_client.put(entity)
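
    Spelled out against the Python client, the pseudo-code might look like the sketch below. It assumes `client` is a `google.cloud.datastore.Client`; the function name `append_item` and its `kind` and `prop` parameters are hypothetical. The read passes `transaction=txn` explicitly so there is no doubt that it belongs to the transaction.

```python
def append_item(client, parent_key, data, kind="item", prop="array"):
    """Create a new item and append its key to the parent's array property.

    Assumption: `client` is a google.cloud.datastore.Client.
    """
    # Allocate the key up front so it can be stored in the parent's array.
    item_key = client.allocate_ids(client.key(kind), 1)[0]

    # The context manager begins the transaction, commits on a clean exit
    # and rolls back if an exception is raised inside the block.
    with client.transaction() as txn:
        # Read inside the transaction, right before modifying.
        parent = client.get(parent_key, transaction=txn)
        if parent is None:
            raise LookupError(f"No entity found for {parent_key!r}")

        # Build the new item from the pre-allocated key.
        item = client.entity(key=item_key)
        item.update(data)

        # Append the new key; create the array if it is missing.
        parent.setdefault(prop, []).append(item_key)

        txn.put(item)
        txn.put(parent)

    return item_key
```

    It could be called as `append_item(datastore.Client(), key, data)`. The point is to keep everything between the `get` and the `put` short and inside the `with` block; any slow work is probably better done before the transaction starts.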