
How to compare two unequal lists and append matching elements value back to the first List


I'm writing a program with two lists of unequal length. One of them contains nested lists, so I flattened it, and now I'm trying to compare the values in the two lists to find matching elements and append them back to the original, unflattened list. The program still doesn't produce the result I expect. I've approached the problem in two different ways, but I'm still arriving at the same answer. Here's what I've tried so far:

import itertools

List1  = [['A'],['b']]
List2 = ['a','A','c','A','b','b','A','b' ]

flattened = list(itertools.chain(*List1))

''' a counter I created to keep List1 from going out of range and crashing
the program
'''
coun = len(flattened)
coun-=1
x = 0

for idx, i in enumerate(List2):
    if i in flattened:
        List1[x].append(List2[idx])
        if x < coun:
            x +=1

print(List1)

And this is the second approach I tried, using itertools to zip the two unequal lists:

import itertools

List1  = [['A'],['b']]
List2 = ['a','A','c','A','b','b','A','b' ]

flattened = list(itertools.chain(*List1))

''' a counter I created to keep List1 from going out of range and crashing
the program
'''
coun = len(flattened)
coun-=1
x = 0

for idx,zipped in enumerate(itertools.zip_longest(flattened,List2)):
    result = filter(None, zipped)
    for i in result:
        if flattened[x] == List2[idx]:
            List1[x].append(List2[idx])
            if x < coun:
                x +=1

print(List1)

Both programs produce the output

[['A', 'A'], ['b', 'A', 'b', 'b', 'A', 'b']]

But I'm trying to arrive at

[['A', 'A', 'A', 'A'], ['b', 'b', 'b', 'b']]

I don't even know if I'm approaching this the right way. I think the problem comes from the flattened list not being the same length as List2, but I can't find any way around it. By the way, I'm still a newbie in Python, so please explain your answers so I can learn from you. Thanks.

EDIT: This is how I get and set the properties of the objects using the values entered by the user. It lacks type checking for now, but that can be added later.

class criticalPath:

    def __init__(self):
        '''
        Initialize all the variables we're going to use to calculate the critical path
        '''
        self.id = None
        self.pred = tuple()
        self.dur = None
        self.est = None
        self.lst = None
        #list to store all the objects
        self.all_objects = list()

    def create_objects(self):

        return criticalPath()

    def get_properties(self):
        '''
        This function gets all the input from the user and stores the
        activity name as a string, the predecessor(s) in a tuple and the
        duration as a string
        '''
        r = criticalPath()
        Object_list = list()
        num_act = int(input('How many activities are in the project:\n'))
        for i in range(num_act):
            name = input('what is the name of the activity {}:\n'.format(i+1))
            activity_object = r.create_objects()
            pred = input('what is the predecessor(s) of the activity:\n')
            pred = tuple(pred.replace(',', ''))
            dur = input('what is the duration of the activity:\n')

            #sets the properties of the objects from what was gotten from the user
            activity_object.set_properties(name, pred, dur)
            #****
            Object_list.append(activity_object)

            self.all_objects.append(activity_object)



        return Object_list

    def set_properties(self, name, predecessor, duration):
        self.id = name
        self.pred = predecessor
        self.dur = duration

So all_objects and Object_list are both lists of all the objects created.


Solution

  • If your values are immutable, use a collections.Counter dict to count the occurrences of each element in List2, then repeat each sublist of List1 (occurrences + 1) times:

    List1  = [['A'],['b']]
    List2 = ['a','A','c','A','b','b','A','b' ]
    
    from collections import Counter
    
    # gets the frequency count of each element in List2
    c = Counter(List2)
    
    # repeat each sublist (frequency + 1) times, where frequency is how many
    # times its element appears in List2 according to our Counter dict
    print([sub * (c[sub[0]] + 1) for sub in List1])
    # [['A', 'A', 'A', 'A'], ['b', 'b', 'b', 'b']]
    

    You can change the original object using [:]:

    List1[:] = [sub * c[sub[0]]+sub for sub in List1]
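To see why the slice assignment matters, here is a small sketch (the name `alias` is only for illustration). Assigning to `List1[:]` replaces the contents of the existing list object, so any other reference to that list sees the change, whereas a plain `List1 = ...` would only rebind the name:

```python
from collections import Counter

List1 = [['A'], ['b']]
List2 = ['a', 'A', 'c', 'A', 'b', 'b', 'A', 'b']
alias = List1  # hypothetical second reference to the same outer list

c = Counter(List2)

# Slice assignment mutates the existing outer list in place.
List1[:] = [sub * c[sub[0]] + sub for sub in List1]

print(alias is List1)  # True: still the same object
print(alias)
```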
    

    To do it using enumerate and updating List1:

    from collections import Counter
    
    c = Counter(List2)
    
    from copy import deepcopy
    
    # iterate over a copy of List1
    for ind, ele in enumerate(deepcopy(List1)):
        # iterate over the sub elements
        for sub_ele in ele:
            # keep the original object and append c[sub_ele] more references to it
            List1[ind].extend([sub_ele] * c[sub_ele])
    
    print(List1)
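For comparison with the loops in the question, the same result can be reached without Counter. The attempts in the question advance the index x on every match, regardless of which sublist the value belongs to. The sketch below instead looks up the target sublist by its first element. The dict name `by_key` is an illustrative choice, and the code assumes the first element of each sublist is hashable and unique:

```python
List1 = [['A'], ['b']]
List2 = ['a', 'A', 'c', 'A', 'b', 'b', 'A', 'b']

# Map each sublist's first element to the sublist itself (assumed unique).
by_key = {sub[0]: sub for sub in List1}

for item in List2:
    target = by_key.get(item)
    # Only append when some sublist starts with this value.
    if target is not None:
        target.append(item)

print(List1)
```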
    

    If you have mutable values, you will need to make copies or create new objects in the generator expression, depending on how they are created:

    from copy import deepcopy
    for ind, ele in enumerate(deepcopy(List1)):
        for sub_ele in ele:
            List1[ind].extend(deepcopy(sub_ele) for _ in range(c[sub_ele]))
    
    print(List1)
    

    There is no need to check whether an object is in List2 first: a Counter returns 0 for elements it has not seen, so nothing is added for them.
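A quick way to confirm this behaviour for missing elements (the value 'z' is just an arbitrary element that does not occur in List2):

```python
from collections import Counter

c = Counter(['a', 'A', 'c', 'A', 'b', 'b', 'A', 'b'])

print(c['A'])           # 3
print(c['z'])           # 0, no KeyError for a missing element
print(['z'] * c['z'])   # [], so extending with it adds nothing
```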

    Based on the edit, you can either check every node against every node or use a dict that groups common nodes:

    Checking every node:

    from copy import deepcopy
    
    for ind, st_nodes in enumerate(starting_nodes):
        for node in object_list:
            if st_nodes[0].id == node.pred:
                starting_nodes[ind].append(deepcopy(node))
    print(starting_nodes)
    

    Using a dict grouping all nodes by the pred attribute:

    from copy import deepcopy
    from collections import defaultdict
    
    nodes = defaultdict(list)
    for node in object_list:
        nodes[node.pred].append(node)
    
    for ind, st_nodes in enumerate(starting_nodes):
        starting_nodes[ind].extend(deepcopy(nodes.get(st_nodes[0].id,[])))
    

    For larger input, the dict option should be more efficient, since it makes a single pass over object_list instead of one pass per starting node.
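One caveat when applying this to the class in the question: `get_properties` stores `pred` as a tuple of single-character names, so a node's `pred` will not compare equal to a single `id` string. Below is a minimal sketch of one way to handle that, grouping each node under every predecessor it lists. The `Node` class and the sample data are stand-ins made up for illustration, not the asker's actual objects, and `starting_nodes` is assumed to be a list of one-element lists as in the loops above:

```python
from collections import defaultdict
from copy import deepcopy


class Node:
    """Hypothetical stand-in for the question's criticalPath objects."""

    def __init__(self, name, pred=(), dur=None):
        self.id = name
        self.pred = tuple(pred)  # e.g. 'AB' becomes ('A', 'B')
        self.dur = dur

    def __repr__(self):
        return "Node({!r})".format(self.id)


# Sample data (made up): C depends on both A and B.
object_list = [Node('A'), Node('B', 'A'), Node('C', 'AB'), Node('D', 'B')]
starting_nodes = [[object_list[0]], [object_list[1]]]

# Group every node under each of its predecessors.
successors = defaultdict(list)
for node in object_list:
    for p in node.pred:
        successors[p].append(node)

# Extend each starting group with copies of the nodes that follow it.
for group in starting_nodes:
    group.extend(deepcopy(successors.get(group[0].id, [])))

print(starting_nodes)
```

If each activity only ever has one predecessor, comparing against `node.pred[0]` would be a simpler alternative.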