python, python-3.x, algorithm, lcs

Find sublists with common starting elements - python


I have a nested list:

lists =[['a','b','c'],
        ['a','b','d'],
        ['a','b','e'],
        ['с','с','с','с']]

I need to find the sublists that share their first 2 or more elements with other sublists (2 or more occurrences), join those common leading elements into a single string, and join any sublist that has no common leading elements into a single string as a whole. The sublists can appear in any order, so just checking the next or previous element is the wrong way, I suppose. Desired output:

   [['a b','c'],
    ['a b','d'],
    ['a b','e'],
    ['с с с с']]

I tried some for loops, but with no success. I currently don't know where to start, so any help would be appreciated. Thank you for your time!


Solution

  • Probably not the most efficient way, but you could try something like this:

    def foo(l, n):
        # Get all of the distinct starting sequences of length n
        first_n = [list(x) for x in set(tuple(x[:n]) for x in l)]

        # Figure out which of those starting sequences occur in 2 or more sublists
        duplicates = []
        for starting_sequence in first_n:
            if len([x for x in l if x[:n] == starting_sequence]) >= 2:
                duplicates.append(starting_sequence)

        # Join the shared prefix into one string; join unmatched sublists whole
        result = []
        for x in l:
            if x[:n] in duplicates:
                result.append([" ".join(x[:n])] + x[n:])
            else:
                result.append([" ".join(x)])

        return result
    

    Sets have no repeats, but the elements of a set must be hashable. Lists are unhashable, which is why I convert the starting sequences into tuples and then back into lists.
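
If the repeated scans over the list become a concern, the same idea can be sketched with `collections.Counter`, which counts every length-`n` prefix in a single pass. This is only a rough variant of the approach above, not a drop-in replacement: the function name `merge_common_prefixes` and the `min_count` parameter are made up for illustration, and it assumes the prefix length `n` is known in advance (2 in the question's example).

```python
from collections import Counter

def merge_common_prefixes(lists, n=2, min_count=2):
    # Count how many sublists start with each length-n prefix.
    # Tuples are used as keys because lists are not hashable.
    prefix_counts = Counter(tuple(sub[:n]) for sub in lists)

    result = []
    for sub in lists:
        if prefix_counts[tuple(sub[:n])] >= min_count:
            # Shared prefix: join it, keep the remaining elements separate.
            result.append([" ".join(sub[:n])] + sub[n:])
        else:
            # No shared prefix: join the whole sublist into one string.
            result.append([" ".join(sub)])
    return result

lists = [['a', 'b', 'c'],
         ['a', 'b', 'd'],
         ['a', 'b', 'e'],
         ['c', 'c', 'c', 'c']]
print(merge_common_prefixes(lists))
# [['a b', 'c'], ['a b', 'd'], ['a b', 'e'], ['c c c c']]
```

Because the counts are collected before the output is built, the order of the sublists does not matter.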