Tags: list, python-2.7, comparison, python-itertools

What is the most efficient way to compare a simple list A with a list of lists B?


I have list A:

['0.0720', '0.0200', '0.0260', '0.0210', '0.0740', '0.0510', '0.0160']

and also list B:

[
[0.074, 0.073, 0.072, 0.03, 0.029, 0.024, 0.021, 0.02], 
[0.02, 0.015], 
[0.026, 0.02, 0.015], 
[0.021, 0.02, 0.017], 
[0.077, 0.076, 0.074, 0.055, 0.045, 0.021], 
[0.053, 0.052, 0.051, 0.023, 0.022], 
[0.016]
]

What is the most efficient way to compare the first element of A with the first sub-list of B, the second element of A with the second sub-list of B, the third element of A with the third sub-list of B, and so on, removing the matching element from the sub-list in B if the sub-list contains 2 or more elements?


Solution

  • Using zip() you can pair up elements from two lists:

    for a, b in zip(A, B):
        # a is an element from A, b is the corresponding sublist from B.
        pass  # placeholder; the comparison goes here
    

    Your sublists contain floating point values, while list A contains strings, so you'll need to decide on a tolerance for the comparison. Perhaps formatting the floats as strings with the same precision (4 decimal places) would suffice?

    for a, b in zip(A, B):
        # a is an element from A, b is a sublist from B.
        b[:] = [i for i in b if format(i, '.4f') != a]
    

    Using slice assignment (b[:] = ...) we replace the contents of the sublist in place, keeping only the elements that do not match a when formatted to 4 digits after the decimal point.
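
    To illustrate why the slice assignment matters, here is a small sketch comparing it with plain rebinding of the loop variable. The data is a shortened, hypothetical subset of the question's lists, and the names rebound and in_place are only for illustration.

```python
# Hypothetical sample data, shortened from the question's lists.
A = ['0.0200', '0.0260']
B = [[0.02, 0.015], [0.026, 0.02, 0.015]]

# Rebinding: `b = ...` only points the loop variable at a new list,
# so the sublists stored inside B are left untouched.
for a, b in zip(A, B):
    b = [i for i in b if format(i, '.4f') != a]
rebound = [list(sub) for sub in B]  # snapshot of B after rebinding

# Slice assignment: `b[:] = ...` replaces the contents of the same
# list object that B holds, so the change is visible through B.
for a, b in zip(A, B):
    b[:] = [i for i in b if format(i, '.4f') != a]
in_place = [list(sub) for sub in B]  # snapshot of B after slice assignment

print(rebound)
print(in_place)
```

    The first snapshot should still show the original sublists, while the second should show the matches removed.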

    Running that on your example input gives me:

    [
    [0.074, 0.073, 0.03, 0.029, 0.024, 0.021, 0.02], 
    [0.015], 
    [0.02, 0.015], 
    [0.02, 0.017], 
    [0.077, 0.076, 0.055, 0.045, 0.021], 
    [0.053, 0.052, 0.023, 0.022], 
    []
    ]
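
    Note that the last sublist ends up empty, whereas the question asks for removal only when the sublist contains 2 or more elements. A minimal sketch of one way to respect that condition is to check the length before filtering. The function name remove_matches and the min_len parameter are hypothetical, chosen here for illustration.

```python
def remove_matches(A, B, min_len=2):
    """Remove values matching A[i] from B[i], in place,
    but only when B[i] holds at least min_len elements."""
    for a, b in zip(A, B):
        if len(b) >= min_len:
            # Compare at 4 decimal places, as in the loop above.
            b[:] = [i for i in b if format(i, '.4f') != a]
    return B


A = ['0.0720', '0.0200', '0.0260', '0.0210', '0.0740', '0.0510', '0.0160']
B = [
    [0.074, 0.073, 0.072, 0.03, 0.029, 0.024, 0.021, 0.02],
    [0.02, 0.015],
    [0.026, 0.02, 0.015],
    [0.021, 0.02, 0.017],
    [0.077, 0.076, 0.074, 0.055, 0.045, 0.021],
    [0.053, 0.052, 0.051, 0.023, 0.022],
    [0.016],
]

result = remove_matches(A, B)
print(result[-1])  # the single-element sublist is left alone
```

    This only checks the length before removal. A sublist of 2 or more elements that all match would still come out empty, so adjust the check if that case matters to you.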
    

    If you only want the first match to be removed, use:

    for a, b in zip(A, B):
        try:
            # index of the first element of b that matches a
            del b[next(i for i, e in enumerate(b) if format(e, '.4f') == a)]
        except StopIteration:
            pass  # no match in this sublist
    

    This finds the index of the first match and deletes that element from b. The result is, in this case, exactly the same as before:

    [[0.074, 0.073, 0.03, 0.029, 0.024, 0.021, 0.02],
     [0.015],
     [0.02, 0.015],
     [0.02, 0.017],
     [0.077, 0.076, 0.055, 0.045, 0.021],
     [0.053, 0.052, 0.023, 0.022],
     []]
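
    As an alternative to comparing formatted strings, the tolerance mentioned earlier could also be applied numerically. The following is a sketch under assumptions: the function name remove_close, the tol parameter, and its default of 5e-5 (half a unit in the 4th decimal place) are hypothetical choices, not something taken from the question.

```python
def remove_close(A, B, tol=5e-5):
    """Remove from each sublist of B, in place, every value that lies
    within tol of the corresponding (string) element of A."""
    for a, b in zip(A, B):
        target = float(a)  # A holds strings such as '0.0720'
        b[:] = [x for x in b if abs(x - target) > tol]
    return B


# Hypothetical sample data for illustration.
A = ['0.0720', '0.0160']
B = [[0.074, 0.072, 0.0720004], [0.016]]

result = remove_close(A, B)
print(result)
```

    Whether a numeric tolerance or string formatting is the better fit depends on how the values in B were produced. The two approaches can disagree for values that sit close to a rounding boundary.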