I have a situation where I need to loop over two lists of objects, find equal ones, and then loop over their fields and change some attributes. It looks like this:
for new_product in products_and_articles['products']:
    for old_product in products_for_update:
        if new_product.article == old_product.article:
            for old_field in old_product._meta.get_all_field_names():
                for new_field in new_product._meta.get_all_field_names():
                    if old_field == new_field and old_field != 'id' and old_field != 'slug':
                        setattr(old_product, old_field, getattr(new_product, old_field))
Obviously this is far from being good or even acceptable, so I'm seeking advice on how I can avoid so much looping and improve the algorithm.
It helps if you break the process into logical, reusable parts.
for new_product in products_and_articles['products']:
    for old_product in products_for_update:
        if new_product.article == old_product.article:
            …
For example, what you are doing here is finding a product that matches a particular article. Since article is unique, we could write something like this:
def find_products_by_article(products, article):
    '''Find the first product that matches the given article.
    Returns either the product or None if no match exists.'''
    for product in products:
        if product.article == article:
            return product
    return None
Then call it with:
for old_product in products_for_update:
    new_product = find_products_by_article(
        products_and_articles['products'],
        old_product.article)
    …
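One thing to watch for: find_products_by_article returns None when nothing matches, so the caller should guard against that before touching the result. A minimal sketch of the guarded loop (the final comment marks where the field-updating logic from your question would go):

for old_product in products_for_update:
    new_product = find_products_by_article(
        products_and_articles['products'],
        old_product.article)
    if new_product is None:
        continue  # no product with this article; skip it
    # ... update old_product's fields from new_product here ...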
But this could be much more efficient if we take advantage of a data structure that is optimized for look-ups, namely a dict (constant instead of linear lookup complexity). So what we could do instead is:
# build a dictionary that stores products indexed by article
products_by_article = {product.article: product
                       for product in products_and_articles['products']}
for old_product in products_for_update:
    try:
        # look up the product in the dictionary
        new_product = products_by_article[old_product.article]
    except KeyError:
        # silently ignore products that don't exist
        continue
    …
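If you want to see the difference concretely, here is a quick, self-contained timing comparison of a list scan versus a dict lookup. The Product namedtuple and the numbers are stand-ins purely for illustration, not part of your code:

import timeit
from collections import namedtuple

# stand-in for the real model, just for this timing experiment
Product = namedtuple('Product', ['article', 'name'])
products = [Product(article=i, name='p%d' % i) for i in range(10000)]
products_by_article = {p.article: p for p in products}

# linear scan: walks the list until the article is found
scan = lambda: next(p for p in products if p.article == 9999)
# hash lookup: a single dictionary access
lookup = lambda: products_by_article[9999]

print(timeit.timeit(scan, number=1000))
print(timeit.timeit(lookup, number=1000))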
If you do such lookups frequently, it would be better to reuse the products_by_article dictionary elsewhere as well, instead of building one from scratch every time. Be careful though: if you use multiple representations of the product records, you need to make sure they always stay in sync!
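One way to keep them in sync is to funnel every modification through a single owner object, so the list and the index can never diverge. The ProductIndex class below is a hypothetical sketch of that idea, not something taken from your code:

class ProductIndex(object):
    '''Keeps a product list and its article-indexed dictionary
    in sync by funnelling every modification through one place.'''

    def __init__(self, products):
        self.products = list(products)
        self.by_article = {p.article: p for p in self.products}

    def add(self, product):
        # update both representations together so they cannot diverge
        self.products.append(product)
        self.by_article[product.article] = product

    def find(self, article):
        '''Return the product with the given article, or None.'''
        return self.by_article.get(article)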
For the inner loops, notice that new_field here only serves as a check for whether a field exists:
…
for old_field in old_product._meta.get_all_field_names():
    for new_field in new_product._meta.get_all_field_names():
        if old_field == new_field and old_field != 'id' and old_field != 'slug':
            setattr(old_product, old_field, getattr(new_product, old_field))
(Note that this is slightly suspicious: any new fields that don't already exist in old_product are silently discarded. Is this intentional?)
This can be repackaged as follows:
def transfer_fields(old, new, exclusions=('id', 'slug')):
    '''Update all pre-existing fields in the old record to have
    the same values as the new record. The 'exclusions' parameter
    can be used to exclude certain fields from being updated.'''
    # use a set here for efficient membership operations
    fields = set(old._meta.get_all_field_names())
    fields.intersection_update(new._meta.get_all_field_names())
    fields.difference_update(exclusions)
    for field in fields:
        setattr(old, field, getattr(new, field))
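The exclusions parameter also makes it easy to protect additional fields later. For instance, if your records had a hypothetical created_at timestamp that should survive the update, the call would look like:

transfer_fields(old_product, new_product,
                exclusions=('id', 'slug', 'created_at'))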
Putting all this together:
# dictionary of products indexed by article
products_by_article = {product.article: product
                       for product in products_and_articles['products']}

for old_product in products_for_update:
    try:
        new_product = products_by_article[old_product.article]
    except KeyError:
        continue  # ignore non-existent products
    transfer_fields(old_product, new_product)
This final code has a time complexity of O(n × k), where n is the number of products and k is the number of fields: building the dictionary is a one-time O(n) cost, and each old product then needs only a constant-time lookup followed by an O(k) field transfer.