python algorithm chunking

Flocculating data in Python


I am struggling to find the error in my flocculation function.

The goal of the function is to take a list and collapse each run of contiguous equal values into a single value. For example:

[1, 4, 4, 2, 0, 3, 3, 3] => [1, 4, 2, 0, 3]

The function as it stands now is...

def flocculate(array):
    for index1, val1 in enumerate(array):
        if val1 == 0 or not not val1:
            new_array = array[index1+1:]
            for index2, val2 in enumerate(new_array):
                if array[index1] == val2:
                    array[index1 + index2 + 1] = False
                else:
                    break
    return [value for value in array if type(value) is not bool]

However, it doesn't seem to handle zeros very well.

For example, the input shown below gets some of the zeros correct, but misses some others...

[2, 4, 4, 0, 3, 7, 0, 2, 2, 2, 8, 0, 0, 0] => [2, 4, 3, 7, 0, 2, 8, 0]


Solution

  • I deleted my original answer; I finally understood "flocculate" in this context. Sorry ... I'm blinded by several years in ceramics.

    You're going to too much work, tagging things that do or don't match. Simply build a new list from the original, adding only items that do not match the previous one.

    test_list = [
        [1, 4, 4, 2, 0, 3, 3, 3],
        [2, 4, 4, 0, 3, 7, 0, 2, 2, 2, 8, 0, 0, 0],
        [-122, 4, 14, 0, 3, 7, 0, 2, 2, -2, 8, 0, 0, 0, 9999]
    ]
    
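    def flocculate(array):
        """Collapse each run of equal adjacent values into a single value."""
        result = []
        last = object()  # unique sentinel, so even a leading None is kept
        for i in array:
            # Keep a value only when it differs from the one just before it.
            if i != last:
                result.append(i)
                last = i
        return result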
    
    for array in test_list:
        print(array, "\n    =>", flocculate(array))
    

    Output:

    [1, 4, 4, 2, 0, 3, 3, 3] 
        => [1, 4, 2, 0, 3]
    [2, 4, 4, 0, 3, 7, 0, 2, 2, 2, 8, 0, 0, 0] 
        => [2, 4, 0, 3, 7, 0, 2, 8, 0]
    [-122, 4, 14, 0, 3, 7, 0, 2, 2, -2, 8, 0, 0, 0, 9999] 
        => [-122, 4, 14, 0, 3, 7, 0, 2, -2, 8, 0, 9999]
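
As for why the original version loses zeros: in Python, `bool` is a subclass of `int`, so `False == 0` is true. Once a duplicate has been overwritten with `False`, that marker passes the `val1 == 0` check and then compares equal to a following `0`, which gets marked as well. That is what happens to the `0` right after `4, 4` in the second example. If you would rather keep the tagging approach, here is a minimal sketch that swaps `False` for a unique sentinel object. The names `flocculate_marked` and `REMOVED` are made up for illustration and are not from the question.

```python
# bool is a subclass of int, so the False marker is indistinguishable from 0
# under ==. This is what lets a marked slot "match" a following zero.
print(False == 0)              # True
print(isinstance(False, int))  # True


def flocculate_marked(array):
    """Tagging variant of the question's approach (hypothetical name).

    Duplicates are overwritten with a unique sentinel instead of False,
    so a marker can never compare equal to a real value such as 0.
    """
    REMOVED = object()   # hypothetical sentinel; equal only to itself
    array = list(array)  # work on a copy so the caller's list is untouched
    for i, val in enumerate(array):
        if val is REMOVED:
            continue     # this slot was already collapsed into an earlier one
        for j in range(i + 1, len(array)):
            if array[j] == val:
                array[j] = REMOVED
            else:
                break    # the run of equal values has ended
    return [v for v in array if v is not REMOVED]


print(flocculate_marked([2, 4, 4, 0, 3, 7, 0, 2, 2, 2, 8, 0, 0, 0]))
```

Building a new list, as above, is still the simpler route; the sentinel only shows that the tagging idea works once the marker cannot collide with the data.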