I am running Firestore in Datastore mode. In one of my kinds, each entity has an array property in which I store the keys of related items.
In one of my (Cloud Run) services I write a new entity (of another kind) and add the key of this new item to the array on an existing entity. This write can happen from several places concurrently. I am using "allocate_ids" to pre-allocate the keys...
However, sometimes one of my writes gets overwritten, even though I am using transactions. My code is as follows:
# key = the key of the existing entity to which I want to add data
# data = the data to put into the new item
# We pre-allocate the keys (so we know what to store in the array)
itemkeys = client.allocate_ids(client.key("item"), 1)
itemkey = itemkeys[0]
trans = client.transaction()
trans.begin()
# First handle item
print(f"Writing to item key: {itemkey.id}")
item = datastore.Entity(itemkey)
item.update(data)
# Then entity
print(f"Adding to entity : {key.id}")
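# After a manual begin(), pass the transaction explicitly so the read is part of it
entity = client.get(key, transaction=trans)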
print(f"Current length: {len(entity['array'])}")
entity["array"].append(itemkey)
print(f"New length: {len(entity['array'])}")
# Write the items
trans.put(item)
trans.put(entity)
# Commit and pray
trans.commit()
It looks like sometimes the entity is taken from cache, even though I have put the "get" inside the transaction...
I have logged this information (it's running in a Cloud Run container), and I see the following:
Can anyone provide some options or insights into what is going on? Am I trying something that is not supported?
N.B. I checked, and the database is currently using "Pessimistic" concurrency.
Ok, so after some serious searching I found another microservice that was updating this entity.
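The problem was that in that service a lot of time (multiple seconds) sometimes passed between retrieving the entity, adjusting a single attribute, and writing the entity back. In that window my own service could append a key, after which the other service wrote back its stale copy of the entity and dropped the new key.
The take-away here is that, when you update an entity, you must retrieve it inside the same transaction, just before you change the attributes and write it back.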
I.e. in pseudo-code:
with data_client.transaction():
    entity = data_client.get(key)
    entity["attribute"] = value
    data_client.put(entity)
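
For the array case from the question, the same read-modify-write pattern could look roughly like the helper below. This is a minimal sketch, not the code from either service: the function name "append_key_in_transaction" and the default property name "array" are made up for illustration, and it assumes the client passed in behaves like a google-cloud-datastore Client (transaction(), get() and put()).

```python
def append_key_in_transaction(client, parent_key, item, array_property="array"):
    """Append item.key to the parent's array and write both in one transaction.

    Hypothetical helper: `client` is assumed to behave like
    google.cloud.datastore.Client and `item` like a datastore.Entity.
    """
    # Inside the with-block the client treats this transaction as the
    # current one, so the get() and put() calls below take part in it.
    with client.transaction():
        # Read the parent as late as possible, inside the transaction.
        parent = client.get(parent_key)
        if parent is None:
            raise LookupError(f"No entity found for key {parent_key!r}")

        # Modify the freshly read copy, never one fetched earlier.
        keys = list(parent.get(array_property, []))
        keys.append(item.key)
        parent[array_property] = keys

        client.put(item)
        client.put(parent)
    # Leaving the with-block commits, or rolls back if an exception was raised.
    return len(keys)
```

It would be called as append_key_in_transaction(client, key, item), with the item built from a pre-allocated key as in the question. As far as I can tell, when a transaction is started manually with begin() instead of a with-block, the client does not treat it as the current transaction, so it has to be passed explicitly, e.g. client.get(key, transaction=trans). The pattern only protects the array if every service that writes the entity does its read-modify-write this way.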