Tags: python, python-2.7, python-itertools

Changing elements within a groupby


I'm grouping rows of data together based on a key, and if any of the rows in that group have "R" in the status field, then they should all have that field changed to "R".

Here's the code I've tried:

from operator import itemgetter
from itertools import groupby

headers = data.pop(0)
Col = {headers[i].strip().upper():i for i in range(len(headers))}

data = sorted(data, key=itemgetter(Col["KEY_FIELD"]))
for key,group in groupby(data, lambda x: x[Col["KEY_FIELD"]]):
  for item in group: 
    if any([item[Col["STATUS"]]=="R" for item in group]):
      item[Col["STATUS"]] = "R"

However, this doesn't seem to change anything in the data. Is there a Pythonic way to change the original data variable for each group based on this criterion, or do I need to create a new list and copy the data into it after iterating over each group?


Solution

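  • group is an iterator, so you cannot loop over it twice like that. The any() call inside your loop consumes the rest of the group, leaving nothing for the outer for loop to visit. Convert the group to a list first, and test just once: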

    group_key = itemgetter(Col["KEY_FIELD"])
    data = sorted(data, key=group_key)
    
    for key, group in groupby(data, group_key):
        group = list(group)
        status_r = any(item[Col["STATUS"]] == "R" for item in group)
        for item in group:
            if status_r:
                item[Col["STATUS"]] = "R"
    

    You probably want to swap the for loop and the if test there; there is little point in looping over the group again unless the status_r condition has been met:

    for key, group in groupby(data, group_key):
        group = list(group)
        if any(item[Col["STATUS"]] == "R" for item in group):
            for item in group:
                item[Col["STATUS"]] = "R"
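
To try this out end to end, here is a minimal, self-contained sketch. The sample rows are made up for illustration (the real data comes from the asker's own source), and the names first_pass, second_pass and rows are hypothetical helpers. It first shows that a group yields nothing on a second pass, then applies the list-based fix and updates the rows in place:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows; the first row holds the headers, as in the question.
data = [
    ["KEY_FIELD", "STATUS"],
    ["a", "N"],
    ["b", "N"],
    ["a", "R"],
    ["b", "N"],
    ["a", "N"],
]

headers = data.pop(0)
Col = {name.strip().upper(): i for i, name in enumerate(headers)}

group_key = itemgetter(Col["KEY_FIELD"])
data = sorted(data, key=group_key)

# A groupby group can only be consumed once: a second pass yields nothing.
key, group = next(groupby(data, group_key))
first_pass = list(group)   # the three rows with key "a"
second_pass = list(group)  # empty, the iterator is already exhausted

# Materialise each group as a list, then update its rows in place.
for key, group in groupby(data, group_key):
    rows = list(group)
    if any(row[Col["STATUS"]] == "R" for row in rows):
        for row in rows:
            row[Col["STATUS"]] = "R"
```

Because each row is a mutable list, the sorted data list sees the changes directly, so no separate copy of the data is needed. After the loop, every "a" row in this sample has status "R" while the "b" rows keep "N".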