I am using itertools.tee to make copies of generators that yield dictionaries, and I pass the iterated dictionaries to functions that I don't control and that may modify them. I would therefore like to pass copies of the dictionaries to the functions, but all the tees yield references to the same instance.
This is illustrated by the following simple example:
import itertools
original_list = [{'a':0,'b':1}, {'a':1,'b':2}]
tee1, tee2 = itertools.tee(original_list, 2)
for d1, d2 in zip(tee1, tee2):
    d1['a'] += 1
    print(d1)
    d2['a'] -= 1
    print(d2)
The output is:
{'b': 1, 'a': 1}
{'b': 1, 'a': 0}
{'b': 2, 'a': 2}
{'b': 2, 'a': 1}
While I would like to have:
{'b': 1, 'a': 1}
{'b': 1, 'a': -1}
{'b': 2, 'a': 2}
{'b': 2, 'a': 0}
Of course, in this example there would be many ways to work around this easily, but due to my specific use case, I need a version of itertools.tee that stores copies of all iterated objects in the queues of the tees instead of references to the original.
Is there a straightforward way to do this in Python, or would I have to re-implement itertools.tee in a non-native and, hence, inefficient way?
There is no need to rework tee. Just wrap each iterator produced by tee in a map(dict, ...) iterator:
try:
    # use iterative map from Python 3 if this is Python 2
    from future_builtins import map
except ImportError:
    pass
tee1, tee2 = itertools.tee(original_list, 2)
tee1, tee2 = map(dict, tee1), map(dict, tee2)
This automatically produces a shallow copy of each dictionary as you iterate.
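Because the copy is shallow, values nested inside each dictionary (lists, other dictionaries) are still shared between the tees. If the functions may modify nested values, swapping dict for copy.deepcopy should isolate them. A minimal sketch, assuming Python 3 (lazy map) and using a made-up nested input for illustration:

```python
import copy
import itertools

# Hypothetical data with a nested list, to show the limit of a shallow copy.
nested = [{'a': [0]}, {'a': [1]}]

shallow1, shallow2 = (map(dict, t) for t in itertools.tee(nested, 2))
s1, s2 = next(shallow1), next(shallow2)
s1['a'].append(99)            # mutates the list shared by s1, s2 and nested[0]
shared = s2['a'] is s1['a']   # the inner list was not copied

# Rebuild the input, then copy deeply instead.
nested = [{'a': [0]}, {'a': [1]}]
deep1, deep2 = (map(copy.deepcopy, t) for t in itertools.tee(nested, 2))
d1, d2 = next(deep1), next(deep2)
d1['a'].append(99)            # only d1's own list changes
```

Deep copying costs more per item, so it is only worth it when the dictionaries actually hold mutable values.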
Demo (using Python 3.6):
>>> import itertools
>>> original_list = [{'a':0,'b':1}, {'a':1,'b':2}]
>>> tee1, tee2 = itertools.tee(original_list, 2)
>>> tee1, tee2 = map(dict, tee1), map(dict, tee2)
>>> for d1, d2 in zip(tee1, tee2):
...     d1['a'] += 1
...     print(d1)
...     d2['a'] -= 1
...     print(d2)
...
{'a': 1, 'b': 1}
{'a': -1, 'b': 1}
{'a': 2, 'b': 2}
{'a': 0, 'b': 2}
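If this pattern is needed in several places, the wrapping can be folded into a small helper. The sketch below assumes Python 3; copying_tee and its copier parameter are hypothetical names, not part of itertools. Note that the tee queues still hold references to the originals, and the copy is made when each item is handed out, which has the same effect for consumers that only modify what they receive.

```python
import copy
import itertools


def copying_tee(iterable, n=2, copier=copy.copy):
    """Return n independent iterators that each yield a copy of every item.

    Hypothetical helper: it only wraps itertools.tee and copies on the way out.
    """
    return tuple(map(copier, t) for t in itertools.tee(iterable, n))


original_list = [{'a': 0, 'b': 1}, {'a': 1, 'b': 2}]
tee1, tee2 = copying_tee(original_list)

results = []
for d1, d2 in zip(tee1, tee2):
    d1['a'] += 1   # changes only the copy from tee1
    d2['a'] -= 1   # changes only the copy from tee2
    results.append((d1, d2))
```

Passing copier=copy.deepcopy gives fully independent copies when the items contain nested mutable values.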