google-app-engine, google-cloud-datastore, app-engine-ndb

Google App Engine - upgrading existing NDB property to be a repeated structured property


I have a production application running on GAE, with tons of data in NDB.

One of my models has a property that has never been used; it was added over 2 years ago for "future proofing". The issue is that the property was declared like this:

notes = ndb.TextProperty()

So all my current entities have "notes: None", as the property was never populated.

I would now like to change this to be a repeated structured property like this:

class Note(ndb.Model):
    created_by = ndb.StringProperty()
    text = ndb.TextProperty()

....

notes = ndb.StructuredProperty(Note, repeated=True)

When I make this change, I get the following error:

RuntimeError: StructuredProperty notes expected to find properties separated by periods at a depth of 1; received ['notes']

That makes sense. The main issue is that I'm changing it from a non-repeated to a repeated property. (If I change it to be a single instance of the model 'Note' there is no error, as None can be passed into a non-repeated property.)

I don't really want to add a new property, as the name notes is perfect. The best solution I have found so far is: https://cloud.google.com/appengine/articles/update_schema

However, seeing as I have literally no valid data in the property, it seems like a big expense to migrate roughly 900,000 entities just to remove a field that holds None.

I have even thought about overriding the _deserialize method inside "platform/google_appengine/google/appengine/ext/ndb/model.py", as I can see where it throws the exception based on the value being None and not []. However, that DOES NOT seem like a good idea, or something Google would advise me to do.

The holy grail in my mind would be something like this :

notes = ndb.StructuredProperty(Note, repeated=True, default=[])

or

notes = ndb.StructuredProperty(Note, repeated=True, ignoreNone=True)

That would set the property to the default, i.e. [], on _deserialize failure instead of throwing a 500 and killing my application.

Thanks!


Solution

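  • You have a couple of options. You could make a wrapper model around Note, like:

    # Wrapper model that holds the repeated notes.
    class Notes(ndb.Model):
      notes = ndb.StructuredProperty(Note, repeated=True)

    # On the existing model: non-repeated, so a stored None still loads.
    notes = ndb.StructuredProperty(Notes)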
    

    You could also keep the Python attribute name and store the property under a different name in Datastore, e.g.

    notes = ndb.StructuredProperty(Note, name='real_notes', repeated=True)
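To make the renaming idea concrete, here is a toy sketch in plain Python. It is not the NDB API: plain dicts stand in for stored entities, and `load_notes` is a hypothetical helper invented for illustration. The point it shows is that once the model reads from `real_notes`, the old `notes: None` value stored on existing entities is simply never looked at, so no migration is needed to avoid the error.

```python
# Toy model only: plain dicts stand in for stored Datastore entities.
# This is NOT the NDB API; load_notes is a hypothetical helper.

def load_notes(stored, stored_name="real_notes"):
    """Return the notes kept under `stored_name`, or [] if none exist."""
    value = stored.get(stored_name)
    return list(value) if value else []

# An entity written before the change: the old 'notes' key holds None.
old_entity = {"title": "legacy", "notes": None}

# An entity written after the change: notes live under 'real_notes'.
new_entity = {
    "title": "fresh",
    "real_notes": [{"created_by": "alice", "text": "first note"}],
}

old_notes = load_notes(old_entity)   # the stale 'notes' key is never read
new_notes = load_notes(new_entity)

print(old_notes)
print(new_notes)
```

In application code the attribute would still be called `notes`; only the stored name differs. The leftover `notes` value stays on old entities until they are rewritten or cleaned up separately.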