Let's assume the following Peewee model:
class TestModel(Model):
    id = IntegerField(primary_key=True)
    name = CharField()
    text = TextField()

    class Meta:
        database = db
If we have a populated database with instances of this model, and select them all, we get the following results:
>>> old_models = [m for m in TestModel.select()]
>>> # old_models contains two TestModels stored in DB:
>>> # TestModel(id=1, name='red', text='hello')
>>> # TestModel(id=5, name='blue', text='world')
Now, from an external source, we get a list of data that we convert into our models:
>>> new_models = []
>>> new_models.append(TestModel(id=1, name='red', text='hello'))
>>> new_models.append(TestModel(id=5, name='red', text='NOT WORLD'))
>>> new_models.append(TestModel(id=10, name='green', text='hello world'))
Getting the newly added models (i.e. those not yet present in the DB) is easy:
>>> added_models = [m for m in new_models if m not in old_models]
>>> # added_models will contain the TestModel with ID 10
What's the most efficient way to find the models that have been updated, though? In our case, that is the model with ID 5. Overwriting the existing models with the newly retrieved data won't work, because Peewee marks every field that is assigned as dirty, whether or not its value actually changed. And even if we only overwrite a value when it differs, we lose the ability to compare the old and new values. Any ideas?
I don't think there's any way to do this with the Model API, but there is a way to do it if you are willing to depend on an implementation detail of peewee Models. Also it probably won't be very fast at scale.
There is a dict representing the Model's data in m.__dict__['_data'] that you can use.
First, get a dict of old_models by id:
old_models_by_id = {m.get_id(): m for m in old_models}
Then, write a simple function comparing two models' data:
def compare_models(m1, m2):
    """Return True if two models have exactly the same data"""
    return m1.__dict__['_data'] == m2.__dict__['_data']
Finally, get updated models:
updated_models = [m for m in new_models
                  if m.get_id() in old_models_by_id
                  and not compare_models(m, old_models_by_id[m.get_id()])]
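Since the question also asks about comparing the old and new values, here is a minimal sketch that extends the same idea to report which fields differ. The helper names model_data and changed_fields are hypothetical, not part of Peewee. It also assumes, as far as I know, that Peewee 3.x stores the field values under '__data__' rather than '_data'; check this against the version you have installed, since both are implementation details.

```python
def model_data(m):
    """Return the dict of field values stored on a model instance."""
    attrs = vars(m)
    # Assumption: Peewee 3.x keeps field values in '__data__',
    # while older versions (as in the answer above) use '_data'.
    if '__data__' in attrs:
        return attrs['__data__']
    return attrs['_data']


def changed_fields(old, new):
    """Map each differing field name to its (old_value, new_value) pair."""
    old_data, new_data = model_data(old), model_data(new)
    return {
        name: (old_data.get(name), new_data.get(name))
        for name in set(old_data) | set(new_data)
        if old_data.get(name) != new_data.get(name)
    }
```

With the data from the question, changed_fields(old_models_by_id[5], new_model_with_id_5) should return an entry for name and one for text, each holding the old and new value, and an empty dict for the model with ID 1. A non-empty result can stand in for "not compare_models(...)" when building updated_models.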