More effective loop in Python

I have situation where I need to loop over two lists of objects and find equal and then loop over their fields and change some attributes. Looks it like this

for new_product in products_and_articles['products']:
  for old_product in products_for_update:
    if new_product.article == old_product.article:
      for old_field in old_product._meta.get_all_field_names():
        for new_field in new_product._meta.get_all_field_names():
          if old_field == new_field and old_field != 'id' and old_field != 'slug':
            setattr(old_product, old_field, getattr(new_product, old_field))

Obviously this is far from being good or even acceptable. So I'm seeking for advice how can I avoid so much looping and enhance the algorythm

Solution

It helps if you break the process into logical, reusable parts.

for new_product in products_and_articles['products']:
  for old_product in products_for_update:
    if new_product.article == old_product.article:
      …

For example, here what you are doing is find a product that matches a particular article. Since article is unique, we could write something like this:

def find_products_by_article(products, article):
  '''Find all products that match the given article.  Returns
  either a product or 'None' if it doesn't exist.'''
  for products in products:
    return product

Then call it with:

for old_product in products_for_update:
  new_products = find_products_by_article(
                   products_and_articles['products'],
                   old_product.article)
  …

But this could be much more efficient if we could take advantage of a data structure that is optimized for look-ups, namely a dict (constant instead of linear complexity). So what we could do instead is:

# build a dictionary that stores products indexed by article
products_by_article = dict(product.article, product for product in
                           products_and_articles['products'])

for old_product in products_for_update:
  try:
    # look up product in the dictionary
    new_product = products_by_article[old_product.article]
  except KeyError:
    # silently ignore products that don't exist
    continue
  …

If you do such lookups frequently, it would be better to reuse the products_by_article dictionary elsewhere as well instead of building one from scratch every time. Be careful though: if you use multiple representations of the product records, you need make they always stay in sync!

For the inner loops, notice that new_field here only serves as a check for whether a field exists:

…
  for old_field in old_product._meta.get_all_field_names():
    for new_field in new_product._meta.get_all_field_names():
      if old_field == new_field and old_field != 'id' and old_field != 'slug':
        setattr(old_product, old_field, getattr(new_product, old_field))

(Note that this is slightly suspicious: any new fields that don't already exist in the old_product are silently discarded: is this intentional?)

This can be repackaged as follows:

def transfer_fields(old, new, exclusions=('id', 'slug')):
  '''Update all pre-existing fields in the old record to have
  the same values as the new record.  The 'exclusions' parameter
  can be used to exclude certain fields from being updated.'''
  # use a set here for efficiency reasons
  fields = frozenset(old._meta.get_all_field_names())
  fields.difference_update(new._meta.get_all_field_names())
  fields.difference_update(exclusions)
  for field in fields:
    setattr(old, field, getattr(new, field))

Putting all this together:

# dictionary of products indexed by article
products_by_article = dict(product.article, product for product in
                           products_and_articles['products'])

for old_product in products_for_update:
  try:
    new_product = products_by_article[old_product.article]
  except KeyError:
    continue          # ignore non-existent products
  transfer_fields(old_product, new_product)

This final code has a time complexity of O(n × k), where n is the number of products and k is the number of fields.