python, list, sublist

How to combine sub-sublists only if they share a common value?


I have a list of sublists, each containing its own sub-sublists. If two sub-sublists within the same sublist share a common value at index 1, I'd like to merge them into a single sub-sublist by combining their items.

l = [[
        ['Sublist1','AAA','10','Apple,Pear,Banana'],
        ['Sublist1','AAA','50','Peach,Orange,Banana'],
        ['Sublist1','DDD','3','Bike,Street']
    ],[
        ['Sublist2','CCC','50','Tomator,Lemmon'],
        ['Sublist2','EEE','30','Phone,Sign'],
        ['Sublist2','CCC','90','Strawberry'],
        ['Sublist2','FFF','30','Phone,Sign']
    ],[
        ['Sublist3','BBB','100','Tomator,Lemmon'],
        ['Sublist3','BBB','100','Pear'],
        ['Sublist3','FFF','90','Strawberry'],
        ['Sublist3','FFF','50','']
    ]]

For example, if sub-sublists share AAA at index 1, combine the items at index 2 and index 3. In this case 10 and 50 would become '10,50', and 'Apple,Pear,Banana' and 'Peach,Orange,Banana' would become 'Apple,Pear,Banana,Peach,Orange' (the repeated Banana appears only once).
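
To make the rule concrete, here is a minimal sketch that merges just the two AAA rows. It assumes, going by the desired result, that index 2 keeps every value while index 3 drops repeated items. `combine_pair` is a hypothetical helper name, not something from the question, and the ordering relies on dicts preserving insertion order (Python 3.7+).

```python
def combine_pair(a, b):
    """Merge two rows that share the same values at index 0 and 1."""
    # Index 2: keep every non-empty value, in order.
    numbers = ",".join(x for x in (a[2], b[2]) if x)
    # Index 3: split on commas, drop blanks and repeats, keep first-seen order.
    items = [x for x in (a[3] + "," + b[3]).split(",") if x]
    unique_items = ",".join(dict.fromkeys(items))
    return [a[0], a[1], numbers, unique_items]

row1 = ['Sublist1', 'AAA', '10', 'Apple,Pear,Banana']
row2 = ['Sublist1', 'AAA', '50', 'Peach,Orange,Banana']
print(combine_pair(row1, row2))
# ['Sublist1', 'AAA', '10,50', 'Apple,Pear,Banana,Peach,Orange']
```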

Desired_Result = [[
        ['Sublist1','AAA','10,50','Apple,Pear,Banana,Peach,Orange'],
        ['Sublist1','DDD','3','Bike,Street']
    ],[
        ['Sublist2','CCC','50,90','Tomator,Lemmon,Strawberry'],
        ['Sublist2','EEE','30','Phone,Sign'],
        ['Sublist2','FFF','30','Phone,Sign']
    ],[
        ['Sublist3','BBB','100,100','Tomator,Lemmon,Pear'],
        ['Sublist3','FFF','90,50','Strawberry']
    ]]

Solution

  • Can you try this?

    I assumed there was 'Sublist2' in front of 'FFF' in your sample l.

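    def merge(lst):
        def j(sq):
            # Sets are unordered, so the order of the joined items is not guaranteed.
            return ",".join(sq)
        def m(sl):
            dic = {}
            for ssl in sl:
                # Group on the first two fields, e.g. ('Sublist1', 'AAA').
                k = tuple(ssl[0:2])
                try:
                    v = dic[k]
                except KeyError:
                    dic[k] = v = (set(), set())
                # Collect the index-2 and index-3 items; sets drop duplicates,
                # so '100' and '100' collapse to a single '100'.
                v[0].update(ssl[2].split(','))
                v[0].discard('')
                v[1].update(ssl[3].split(','))
                v[1].discard('')
            # Python 3: dict.items() replaces iteritems().
            return [ list(k) + [j(v[0])] + [j(v[1])] for k, v in sorted(dic.items()) ]
        return [ m(sl) for sl in lst ]

    for sl in merge(l):
        print(sl)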
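
If the output needs to match Desired_Result exactly (items in first-seen order, and repeated numbers such as '100,100' kept), a set-based merge is not quite enough. The following is a sketch of one possible order-preserving variant, not the answer's own method. It assumes Python 3.7+ (dicts keep insertion order) and that every row has exactly four fields; `merge_ordered` is a hypothetical name.

```python
def merge_ordered(lst):
    """Merge rows sharing index 0 and 1, keeping first-seen order."""
    result = []
    for sublist in lst:
        # (index 0, index 1) -> ([index 2 values], {index 3 items})
        groups = {}
        for name, key, number, items in sublist:
            numbers, seen = groups.setdefault((name, key), ([], {}))
            if number:
                numbers.append(number)  # keep repeats such as '100,100'
            for item in items.split(","):
                if item:
                    seen.setdefault(item, None)  # dict used as an ordered set
        result.append([
            [name, key, ",".join(numbers), ",".join(seen)]
            for (name, key), (numbers, seen) in groups.items()
        ])
    return result

sample = [[
    ['Sublist3', 'BBB', '100', 'Tomator,Lemmon'],
    ['Sublist3', 'BBB', '100', 'Pear'],
    ['Sublist3', 'FFF', '90', 'Strawberry'],
    ['Sublist3', 'FFF', '50', ''],
]]
print(merge_ordered(sample))
# [[['Sublist3', 'BBB', '100,100', 'Tomator,Lemmon,Pear'], ['Sublist3', 'FFF', '90,50', 'Strawberry']]]
```

Groups come out in the order their key first appears in each sublist, rather than sorted by key as in the set-based version.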