I've been trying to create entities with a property which should be unique or None something similar to:
class Thing(ndb.Model):
something = ndb.StringProperty()
unique_value = ndb.StringProperty()
Since ndb has no way to specify that a property should be unique it is only natural that I do this manually like this:
def validate_unique(the_thing):
if the_thing.unique_value and Thing.query(Thing.unique_value == the_thing.unique_value).get():
raise NotUniqueException
This works like a charm until I want to do this in an ndb transaction which I use for creating/updating entities. Like:
@ndb.transactional
def create(the_thing):
validate_unique(the_thing)
the_thing.put()
However ndb seems to only allow ancestor queries, the problem is my model does not have an ancestor/parent. I could do the following to prevent this error from popping up:
@ndb.non_transactional
def validate_unique(the_thing):
...
This feels a bit out of place, declaring something to be a transaction and then having one (important) part being done outside of the transaction. I'd like to know if this is the way to go or if there is a (better) alternative.
Also some explanation as to why ndb only allows ancestor queries would be nice.
Since your uniqueness check involves a (global) query it means it's subject to the datastore's eventual consistency, meaning it won't work as the query might not detect freshly created entities.
One option would be to switch to an ancestor query, if your expected usage allows you to use such data architecture, (or some other strongly consistent method) - more details in the same article.
Another option is to use an additional piece of data as a temporary cache, in which you'd store a list of all newly created entities for "a while" (giving them ample time to become visible in the global query) which you'd check in validate_unique()
in addition to those from the query result. This would allow you to make the query outside the transaction and only enter the transaction if uniqueness is still possible, but the ultimate result is the manual check of the cache, inside the transaction (i.e. no query inside the transaction).
A 3rd option exists (with some extra storage consumption as the price), based on the datastore's enforcement of unique entity IDs for a certain entity model with the same parent (or no parent at all). You could have a model like this:
class Unique(ndb.Model): # will use the unique values as specified entity IDs!
something = ndb.BooleanProperty(default=False)
which you'd use like this (the example uses a Unique parent key, which allows re-using the model for multiple properties with unique values, you can drop the parent altogether if you don't need it):
@ndb.transactional
def create(the_thing):
if the_thing.unique_value:
parent_key = get_unique_parent_key()
exists = Unique.get_by_id(the_thing.unique_value, parent=parent_key)
if exists:
raise NotUniqueException
Unique(id=the_thing.unique_value, parent=parent_key).put()
the_thing.put()
def get_unique_parent_key():
parent_id = 'the_thing_unique_value'
parent_key = memcache.get(parent_id)
if not parent_key:
parent = Unique.get_by_id(parent_id)
if not parent:
parent = Unique(id=parent_id)
parent.put()
parent_key = parent.key
memcache.set(parent_id, parent_key)
return parent_key