google-app-engine, default-value, datastore

Most efficient way to iterate through entire datastore and set a default value to a modified schema?


I have an existing schema:

from google.appengine.ext import db

class Example(db.Model):
    row_num = db.IntegerProperty(required=True)
    updated = db.IntegerProperty()
    ...
    ...

I have now updated this to:

class Example(db.Model):
    row_num = db.IntegerProperty(required=True)
    updated = db.IntegerProperty(default=0)  # new default value
    ...
    ...

However, the Datastore already holds over 2 million entities that were saved without updated = 0.

What is the easiest way to do this? Can this be done with a single command from the admin terminal?


Solution

  • You'll need to write a script that iterates over the entities: fetch them in batches (up to 1000 at a time), update the property value, and save them back.

    No, this is not efficient compared with a standard SQL database, where you could just issue a single UPDATE. Bigtable (the technology behind the GAE Datastore) is not a relational SQL database, though: it is an entirely different architecture, designed to be good at different things and not optimized for updating a single field across millions of rows at once. That is why GQL has no UPDATE statement.

    Edit:

    As David kindly pointed out in the comments, Google recently released the Mapper API, which can help with this.
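
The fetch, update, save loop described in this answer can be sketched roughly as follows. This is only an illustration, not the answerer's script: `backfill_default`, `make_fake_store` and `FakeEntity` are hypothetical names, and the store is an in-memory stand-in so the sketch runs anywhere. On App Engine, the two callables would presumably wrap the db query calls instead (for example `Example.all()`, `Query.with_cursor()`, `Query.fetch(1000)`, `Query.cursor()`, and `db.put()` for the batch write).

```python
class FakeEntity(object):
    """Hypothetical stand-in for a stored Example entity."""

    def __init__(self, row_num, updated=None):
        self.row_num = row_num
        self.updated = updated


def backfill_default(fetch_page, save_batch, prop="updated", default=0):
    """Walk every entity page by page, fill in `prop` where unset, re-save.

    fetch_page(cursor) returns (entities, next_cursor); an empty page ends
    the walk. save_batch(entities) writes one page in a single call.
    Returns the number of entities written back.
    """
    cursor = None
    saved = 0
    while True:
        entities, cursor = fetch_page(cursor)
        if not entities:
            break
        for entity in entities:
            # Only fill the gap; existing non-default values are kept.
            if getattr(entity, prop, None) is None:
                setattr(entity, prop, default)
        # The whole page is re-saved so the stored copy carries the property.
        save_batch(entities)
        saved += len(entities)
    return saved


def make_fake_store(entities, page_size=1000):
    """Build in-memory fetch/save callables that mimic cursor paging."""
    writes = []  # size of each batch write, for inspection

    def fetch_page(cursor):
        start = cursor or 0
        page = entities[start:start + page_size]
        return page, start + len(page)

    def save_batch(batch):
        writes.append(len(batch))

    return fetch_page, save_batch, writes


if __name__ == "__main__":
    store = [FakeEntity(i, updated=None if i % 2 else 7) for i in range(2500)]
    fetch_page, save_batch, writes = make_fake_store(store)
    print(backfill_default(fetch_page, save_batch))
    print(writes)
```

Passing the fetch and save steps in as callables keeps the loop independent of the storage API. With 2 million entities, a real run would likely need to be split across several requests or tasks, resuming from the last cursor each time.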