Tags: python, list, set, tuples, set-theory

Set theory magic on list of tuples


In Python I have two lists, A and B. Both lists consist of tuples (x,y). For example:

A = [('x1','y1'), ('x2','y2'), ('x3','y3')]
B = [('x1','y1'), ('x2','y5'), ('x4','y4')]

Now, there are three results I want. All of them are easily solvable with set theory, if only there were no tuples involved.

Result 1: Intersection of both lists: set(A) & set(B). This compares both values of each tuple across both lists. Result should be: C = [('x1','y1')]

Result 2: Intersection of both lists where only the first element, (x,y)[0], matches. Result should be: D = [('x1','y1'), ('x2', ('y2', 'y5'))]. Ideally the solution is D - C, giving E = [('x2', ('y2', 'y5'))], but I can live with having D itself.

Result 3: The elements unique to list B compared to A: set(B) - (set(A) & set(B)), but compared only on (x,y)[0]. Result should be: [('x4', 'y4')].

I couldn't find anything on these problems, and wasn't able to construct a solution myself. Can anyone help?


Solution

  • Here are some ways to do what you want using dicts. This is Python 2 code; it will need some minor modification for Python 3: print is a function there, and dict.iteritems() no longer exists because Python 3's dict.items() returns a view object instead of a list.

    A = [('x1','y1'), ('x2','y2'), ('x3','y3')]
    B = [('x1','y1'), ('x2','y5'), ('x4','y4')]
    
    dA = dict(A)
    dB = dict(B)
    
    #Intersection, the simple way
    print 'Result 1a:', list(set(A) & set(B))
    
    #Intersection using dicts instead of sets
    result = [(k, vA) for k, vA in dA.iteritems() if dB.get(k) == vA]
    print 'Result 1b:', result
    
    #match on 1st tuple element, ignoring 2nd element
    result = {}
    for k, vA in dA.iteritems():
        #test membership, so falsy values such as '' or 0 still count as matches
        if k in dB:
            vB = dB[k]
            result[k] = (vA, vB) if vB != vA else vA
    print 'Result 2a:', result.items()
    
    #match on 1st tuple element only if 2nd elements don't match
    result = {}
    for k, vA in dA.iteritems():
        #test membership, so falsy values such as '' or 0 still count as matches
        if k in dB and dB[k] != vA:
            result[k] = (vA, dB[k])
    print 'Result 2b:', result.items()
    
    #unique elements of B, ignoring 2nd element
    result = [(k, vB) for k, vB in dB.iteritems() if k not in dA]
    print 'Result  3:', result
    

    output

    Result 1a: [('x1', 'y1')]
    Result 1b: [('x1', 'y1')]
    Result 2a: [('x2', ('y2', 'y5')), ('x1', 'y1')]
    Result 2b: [('x2', ('y2', 'y5'))]
    Result  3: [('x4', 'y4')]
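
For readers on Python 3, the same dict-based approach might look like the sketch below. It is an adaptation, not the answer's original code: the names result1, result2a, result2b and result3 are made up for illustration. It assumes the first element of each tuple is unique within its list, since dict() keeps only the last value for a repeated key. It also assumes Python 3.7 or later, where dicts preserve insertion order, so the results come out in the order of A and B.

```python
# Python 3 sketch of the dict-based approach.
# Assumption: first tuple elements are unique within each list,
# because dict() keeps only the last value seen for a repeated key.
A = [('x1', 'y1'), ('x2', 'y2'), ('x3', 'y3')]
B = [('x1', 'y1'), ('x2', 'y5'), ('x4', 'y4')]

dA = dict(A)
dB = dict(B)

# Result 1: pairs present in both lists (both elements must match).
# sorted() is used only to get a deterministic order out of the set.
result1 = sorted(set(A) & set(B))

# Result 2 (D): keys present in both lists; differing values are paired up.
# "k in dB" is a membership test, so falsy values such as '' or 0 still match.
result2a = [(k, vA if vA == dB[k] else (vA, dB[k]))
            for k, vA in dA.items() if k in dB]

# Result 2 (E): shared keys whose values differ.
result2b = [(k, (vA, dB[k]))
            for k, vA in dA.items() if k in dB and dB[k] != vA]

# Result 3: pairs of B whose key does not occur in A.
result3 = [(k, vB) for k, vB in dB.items() if k not in dA]

print('Result 1: ', result1)
print('Result 2a:', result2a)
print('Result 2b:', result2b)
print('Result 3: ', result3)
```

If first elements can repeat within a list, this sketch silently drops all but the last pair for each key; grouping the values per key (for example in a dict of lists) would be needed instead.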