
Idempotent client-retryable entry creation in AppEngine given eventually consistent queries, etc


I need to come up with a strategy for handling client-retries on a data-store entry creation:

  • Client sends request to create new entry in database
  • Server performs entry creation and prepares success-reply
  • Some error happens that makes the client believe that the request wasn't processed (packet loss, ...)
  • Client sends same request to create new entry in database again
  • Server detects the retry, recreates the original reply, and sends it without creating another data-store entry
  • Client receives reply
  • Everyone is happy and only ONE entry was created in database

I have one restriction: The server is STATELESS! It has no kind of session-information on the client.

My current idea is the following:

  • Tag every create-request with a guaranteed globally unique ID (here's how I create them, although they are not too relevant for the question):
    • Using the data-store (and memcache), I assign a unique, monotonically increasing ID to every server instance once it loads (let's call it SI)
    • When a client requests the starting-page, the instance that served the request generates a unique monotonically increasing page-load-id (PL) and sends SI.PL to the client along with the page content
    • For every create-request, the client generates a unique monotonically increasing request-id (RI) and sends SI.PL.RI along with the create-request
  • For every create-request, the server first checks whether it knows the create-tag
  • If not, it creates the new entry and somehow stores the create-tag along with it
  • If it does know the tag, it uses it to find the originally created entry and recreates a corresponding reply
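The steps above can be sketched in Java, with a plain `HashMap` standing in for the keyed datastore lookup (class and method names are illustrative, not from any App Engine API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a HashMap stands in for the datastore, and the
// create-tag is the lookup key used for retry detection.
class IdempotentCreator {
    private final Map<String, String> entriesByTag = new HashMap<>();

    // Returns the reply for this create-request; a retried request with
    // the same tag yields the original reply without a second entry.
    String create(String createTag, String content) {
        String existing = entriesByTag.get(createTag);
        if (existing == null) {
            entriesByTag.put(createTag, content);  // first time: create the entry
            existing = content;
        }
        // Original and retried requests produce the same reply.
        return "created: " + existing;
    }

    int entryCount() { return entriesByTag.size(); }
}
```

A retried `create("1.2.3", ...)` call returns the same reply string as the first call, and `entryCount()` stays at one.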

Here are the implementation options that I am thinking about right now and their problems:

  1. Store the create-tag as an indexed property inside the entry:
    • When the server gets a request, it has to use a query to find any existing entry
    • Problem: Since queries in AppEngine are only eventually consistent, it might miss an entry
  2. Use the create-tag as the entry's key:
    • Should be ok as it is guaranteed to be unique if the numbers don't wrap (unlikely with longs)
    • Minor inconvenience: It increases the length of the entries' keys in any future use (unneeded overhead)
    • Major problem: This will generate sequential entry keys in the datastore, which should be avoided at all costs, as sequential keys create hot-spots in the stored data and can thus impact performance significantly

One solution I am contemplating for option 2 is to use some sort of formula that takes the sequential numbers and re-maps them onto a unique, deterministic, but random-looking sequence instead to eliminate hot-spots. Any ideas on what such a formula could look like?
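One candidate formula (my suggestion, an assumption rather than anything App Engine provides): multiplying by a fixed odd constant is a bijection on 64-bit longs, so sequential IDs map to scattered but guaranteed collision-free keys, and the mapping can even be inverted:

```java
class IdScrambler {
    // Any odd constant works; this one (2^64 divided by the golden ratio)
    // spreads consecutive inputs far apart (Fibonacci hashing).
    private static final long C = 0x9E3779B97F4A7C15L;

    // Bijective: odd multipliers are invertible modulo 2^64,
    // so distinct ids always yield distinct keys.
    static long scramble(long id) {
        return id * C;
    }

    static long unscramble(long key) {
        // Newton's iteration computes the modular inverse of C mod 2^64;
        // each step doubles the number of correct low-order bits.
        long inv = C;
        for (int i = 0; i < 5; i++) inv *= 2 - C * inv;
        return key * inv;
    }
}
```

Because the mapping is a bijection, no collision handling is ever needed, unlike with a hash.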

Or maybe there is a better approach altogether?


Solution

  • While it is possible to make implementation option (1) from above work, by using a cleverly designed data-structure, the right transactions, and garbage-collection, it'll be quite a pain to get it to work reliably.

    So, it seems like the right solution is to go with a generalized version of option (2) instead:

    Use some unique identifier for the created entity as the entity's key.

    That way, if the same entity is created again in a retry, it is either easy to reliably find an existing copy (as gets are strongly consistent), or blindly writing it again will simply overwrite the first version.

    @Greg suggested in the comments to use a hash over the entity's uniquely identifying data as the key. While this solves the problem of having keys that are evenly distributed across parameter-space and thus leads to an efficient distribution of the data across physical storage locations, it does create the new problem of having to manage (or ignore) hash-collisions, especially if one tries to keep keys from getting very long.

    There are ways to handle these collisions. For example: In case of a collision, compare the actual content to check whether it truly is a duplicate, and, if not, add a "1" to the key. Then, see if that key exists also, and, if so, check again if it has the same content. If not, add a "2" instead, check again for a collision, and so on... While this works, it gets quite messy.
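That probing scheme, sketched with a `HashMap` standing in for keyed datastore lookups (names are illustrative):

```java
import java.util.Map;

class CollisionProbing {
    // Returns the key under which content ends up stored: either an
    // existing key whose entry has identical content (a true duplicate),
    // or a fresh, possibly suffixed key if the hash collided with
    // different content.
    static String putDeduplicated(Map<String, String> store,
                                  String hashKey, String content) {
        String key = hashKey;
        for (int suffix = 1; ; suffix++) {
            String existing = store.get(key);
            if (existing == null) {          // free slot: store here
                store.put(key, content);
                return key;
            }
            if (existing.equals(content)) {  // true duplicate: reuse it
                return key;
            }
            key = hashKey + suffix;          // collision: try "1", "2", ...
        }
    }
}
```

As the post says, this works but is messy: a lookup may need several round-trips, and deletions would leave holes in the probe sequence.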

    Or you can just say that hash collisions are so rare that one will never have enough user data in one's database to ever see one. I personally don't like these sorts of "keep-your-fingers-crossed"-approaches, but in many cases, it might be an acceptable way to go.

    But, luckily I already have a collision-free globally unique identifier for the data: the create-tag. And it turns out the two issues that I saw with using it are both easily remedied through some clever bit-shuffling:

    Using the same identifiers as in the original question, my create-tag SI.PL.RI consists of SI, which will keep increasing forever, PL, which resets to 0 every time a new server instance is created, and RI which resets for every new client session. So RI is likely always tiny, PL will stay somewhat small, and SI will slowly get huge.

    Given that, I could for example build the key like this (starting with the most significant bits):

    - Lowest 10 bits of PL
    - Lowest  4 bits of RI
    - Lowest 17 bits of SI
    - 1 bit indicating whether there are any further non-zero values
    - Next lowest 10 bits of PL
    - Next lowest  4 bits of RI
    - Next lowest 17 bits of SI
    - 1 bit indicating whether there are any further non-zero values
    - ... until ALL bits of RI, PL, and SI are used (eventually breaking 10-4-17 pattern)
    

    That way, the generated keys are spread nicely across the parameter space when sorted in lexical order (as AppEngine does), the first keys are only half as long as the auto-generated ones, and they only get longer as needed.
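    A sketch of that encoder (simplified: it keeps repeating the 10-4-17 pattern instead of switching patterns once a field is exhausted), emitting the key as a hex string so that lexical order matches byte order:

```java
class TagKeyEncoder {
    // Packs SI/PL/RI into 4-byte groups: 10 bits of PL, 4 bits of RI,
    // 17 bits of SI, plus 1 continuation bit. Low-order bits go first,
    // so the fastest-changing bits land at the front of the key.
    static String encode(long si, long pl, long ri) {
        StringBuilder key = new StringBuilder();
        while (true) {
            int group = (int) (((pl & 0x3FFL)   << 22)
                             | ((ri & 0xFL)     << 18)
                             | ((si & 0x1FFFFL) << 1));
            pl >>>= 10;
            ri >>>= 4;
            si >>>= 17;
            boolean more = (pl | ri | si) != 0;
            if (more) group |= 1;                  // continuation bit
            key.append(String.format("%08x", group));
            if (!more) return key.toString();
        }
    }
}
```

    Small tags produce a single 4-byte group, and each group's continuation bit makes the encoding unambiguous, so distinct (SI, PL, RI) triples always yield distinct keys.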

    Aside 1:

    In fact, if no server instance is ever alive for long enough to serve more than a thousand page-loads and no one client ever creates more than 16 new entities in one session, and server instances are not spawned faster than one every 5 minutes on average, it will take more than a year before keys get longer than 4 bytes on average.

    And if no server instance is ever alive for long enough to serve more than a million page-loads and no one client ever creates more than 256 new entities in one session, and server instances are not spawned faster than one every second on average, it will still take over 500 years before keys get longer than 8 bytes (and thus longer than auto-generated ones) on average. Should be fine... :)

    Aside 2:

    If I need to use the keys to index a Java HashMap instead, the hashCode() function of my key-object can instead return an integer built from the first 4 key-bytes in reverse order to spread the keys across buckets.
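    As a sketch (the wrapper class is illustrative): reversing the first four key bytes puts the fastest-changing bits of the key into the low bits of the hash, which is what `HashMap` bucketing favors:

```java
import java.util.Arrays;

class EntityKey {
    private final byte[] bytes;

    EntityKey(byte[] bytes) { this.bytes = bytes.clone(); }

    @Override public boolean equals(Object o) {
        return o instanceof EntityKey
            && Arrays.equals(bytes, ((EntityKey) o).bytes);
    }

    @Override public int hashCode() {
        // Build an int from the first 4 key bytes in reverse order,
        // so byte 0 (the most varied one) ends up in the low byte.
        int h = 0;
        for (int i = Math.min(3, bytes.length - 1); i >= 0; i--) {
            h = (h << 8) | (bytes[i] & 0xFF);
        }
        return h;
    }
}
```

    For a key starting with bytes `01 02 03 04`, this yields the hash `0x04030201`.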