Let's say I have this class:
class Spam(object):
    def __init__(self, a):
        self.a = a
And now I have these objects:
s1 = Spam((1, 1, 1, 4))
s2 = Spam((1, 2, 1, 4))
s3 = Spam((1, 2, 1, 4))
s4 = Spam((2, 2, 1, 4))
s5 = Spam((2, 1, 1, 8))
s6 = Spam((2, 1, 1, 8))
objects = [s1, s2, s3, s4, s5, s6]
After running some function, I need two lists: one containing the objects that share their a attribute value with at least one other object, and one containing the objects whose a attribute value is unique.
Like this:
dups = [s2, s3, s5, s6]
normal = [s1, s4]
So it is similar to finding duplicates, except that the first occurrence of an object with a shared a attribute value must also go into the duplicates list.
I have written this function and it seems to work, but I find it quite ugly (and probably not very efficient).
def eggs(objects):
    vals = []
    dups = []
    normal = []
    for obj in objects:
        if obj.a in vals:
            dups.append(obj)
        else:
            normal.append(obj)
            vals.append(obj.a)
    dups_vals = [o.a for o in dups]
    # separate again
    new_normal = []
    for n in normal:
        if n.a in dups_vals:
            dups.append(n)
        else:
            new_normal.append(n)
    return dups, new_normal
Can anyone suggest a more Pythonic approach to this problem?
I would group the objects in a dictionary, using the a attribute as the key. Then I would separate them by group size: groups with more than one member hold the duplicates, and groups with exactly one member hold the unique objects.
import collections

def separate_dupes(seq, key_func):
    d = collections.defaultdict(list)
    for item in seq:
        d[key_func(item)].append(item)
    dupes = [item for v in d.values() for item in v if len(v) > 1]
    uniques = [item for v in d.values() for item in v if len(v) == 1]
    return dupes, uniques

class Spam(object):
    def __init__(self, a):
        self.a = a
    # this method is not necessary for the solution, just for displaying the results nicely
    def __repr__(self):
        return "Spam({})".format(self.a)

s1 = Spam((1, 1, 1, 4))
s2 = Spam((1, 2, 1, 4))
s3 = Spam((1, 2, 1, 4))
s4 = Spam((2, 2, 1, 4))
s5 = Spam((2, 1, 1, 8))
s6 = Spam((2, 1, 1, 8))
objects = [s1, s2, s3, s4, s5, s6]
dupes, uniques = separate_dupes(objects, lambda item: item.a)
print(dupes)
print(uniques)
Result (the groups come out in dict iteration order, so their order can differ between Python versions):
[Spam((2, 1, 1, 8)), Spam((2, 1, 1, 8)), Spam((1, 2, 1, 4)), Spam((1, 2, 1, 4))]
[Spam((1, 1, 1, 4)), Spam((2, 2, 1, 4))]
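If the output lists should keep the objects in their original input order (as in the dups and normal lists shown in the question), one possible variation is to count the keys first and then filter in a second pass. The sketch below is only an illustration under some assumptions: the name separate_dupes_ordered is hypothetical, the keys must be hashable, and seq must be something that can be iterated more than once (a list, not a generator).

```python
from collections import Counter
from operator import attrgetter


class Spam(object):
    def __init__(self, a):
        self.a = a


def separate_dupes_ordered(seq, key_func):
    # Hypothetical helper, not part of the answer above.
    # First pass: count how many items share each key.
    counts = Counter(key_func(item) for item in seq)
    # Second pass: split by count while keeping the input order.
    dupes = [item for item in seq if counts[key_func(item)] > 1]
    uniques = [item for item in seq if counts[key_func(item)] == 1]
    return dupes, uniques


s1 = Spam((1, 1, 1, 4))
s2 = Spam((1, 2, 1, 4))
s3 = Spam((1, 2, 1, 4))
s4 = Spam((2, 2, 1, 4))
s5 = Spam((2, 1, 1, 8))
s6 = Spam((2, 1, 1, 8))
objects = [s1, s2, s3, s4, s5, s6]

dupes, uniques = separate_dupes_ordered(objects, attrgetter("a"))
print(dupes == [s2, s3, s5, s6])  # True
print(uniques == [s1, s4])        # True
```

Like the dictionary version, this avoids the repeated list membership tests of the original eggs function, at the cost of calling key_func more than once per item.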