Find common values between files

I have three text files:

fileA:

13  abc
123 def
234 ghi
1234    jkl
12  mno

fileB:

12  abc
12  def
34  qwe
43  rty
45  mno

fileC:

12  abc
34  sdg
43  yui
54  poi
54  def

I would like to see which values in the 2nd column match between the files. The following code works if the 2nd column is already sorted, but if it is not sorted, how do I sort the 2nd column and compare the files?

fileA = open("A.txt",'r')
fileB = open("B.txt",'r')
fileC = open("C.txt",'r')

listA1 = []
for line1 in fileA:
    listA = line1.split('\t')
    listA1.append(listA)


listB1 = []
for line1 in fileB:
    listB = line1.split('\t')
    listB1.append(listB)


listC1 = []
for line1 in fileC:
    listC = line1.split('\t')
    listC1.append(listC)

for key1 in listA1:
    for key2 in listB1:
        for key3 in listC1:
            if key1[1] == key2[1] and key2[1] == key3[1] and key3[1] == key1[1]:
                print "Common between three files:",key1[1]

print "Common between file1 and file2 files:"
for key1 in listA1:
    for key2 in listB1:
        if key1[1] == key2[1]:
            print key1[1]

print "Common between file1 and file3 files:"
for key1 in listA1:
    for key2 in listC1:
        if key1[1] == key2[1]:
            print key1[1]

Solution

If you just want to sort A1, B1, and C1 by the second column, this is easy:

import operator

listA1.sort(key=operator.itemgetter(1))

If you don't understand itemgetter, this is the same:

listA1.sort(key=lambda element: element[1])
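As a quick check that the two key functions behave identically, here is a minimal sketch using the rows of fileA from the question. The names rows_a and shuffled are made up for the illustration, and the rows are assumed to have been split into two fields already.

```python
import operator

# Rows of fileA from the question, already split into [count, name] pairs.
rows_a = [["13", "abc"], ["123", "def"], ["234", "ghi"],
          ["1234", "jkl"], ["12", "mno"]]

# Put the rows out of order so the sort has something to do.
shuffled = [rows_a[4], rows_a[1], rows_a[0], rows_a[3], rows_a[2]]

# Both key functions pick element [1] of each row, so the results match.
by_itemgetter = sorted(shuffled, key=operator.itemgetter(1))
by_lambda = sorted(shuffled, key=lambda element: element[1])

print(by_itemgetter == by_lambda)
```

sorted returns a new list, while listA1.sort(...) sorts in place; either works here.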

However, I think a better solution is to just use a set:

setA1 = set(element[1] for element in listA1)
setB1 = set(element[1] for element in listB1)
setC1 = set(element[1] for element in listC1)

Or, more simply, don't build the lists in the first place; do this:

setA1 = set()
for line1 in fileA:
    # Strip the newline first so the last column compares cleanly.
    listA = line1.rstrip('\n').split('\t')
    setA1.add(listA[1])

Either way:

print "Common between file1 and file2 files:"
for key in setA1 & setB1:
    print key
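To make the set approach concrete, here is a small sketch with the second-column values from the question typed in as literals (set_a, set_b, and set_c are illustrative names). It uses the print() function form rather than the Python 2 print statement used elsewhere in this answer.

```python
# Second-column values of fileA, fileB, and fileC from the question.
set_a = {"abc", "def", "ghi", "jkl", "mno"}
set_b = {"abc", "def", "qwe", "rty", "mno"}
set_c = {"abc", "sdg", "yui", "poi", "def"}

# & is set intersection; sorted() is only there to make the output stable.
common_ab = sorted(set_a & set_b)
common_ac = sorted(set_a & set_c)
common_all = sorted(set_a & set_b & set_c)

print(common_ab)   # ['abc', 'def', 'mno']
print(common_ac)   # ['abc', 'def']
print(common_all)  # ['abc', 'def']
```

No sorting of the input is needed, because set membership does not depend on order.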

To simplify it further, you probably want to refactor the repeated code into a function first:

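def read_file(path):
    # Collect the distinct second-column values of a tab-separated file.
    with open(path) as f:
        result = set()
        for line in f:
            # Strip the newline so the last column compares cleanly.
            columns = line.rstrip('\n').split('\t')
            result.add(columns[1])
    return result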

setA1 = read_file('A.txt')
setB1 = read_file('B.txt')
setC1 = read_file('C.txt')
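One detail worth checking when reading lines this way: splitting a raw line on tabs leaves the trailing newline attached to the last field, so a value on the final line of a file that lacks a closing newline will not match the same value elsewhere. A minimal sketch of the pitfall, with second_column as a hypothetical helper and the two sample lines assumed to be tab-separated:

```python
def second_column(line):
    """Return the second tab-separated field of a line, without its newline."""
    return line.rstrip("\n").split("\t")[1]

line_mid_file = "123\tdef\n"  # a line followed by a newline
line_at_eof = "54\tdef"       # last line of a file with no trailing newline

# Splitting without stripping keeps the newline on the last field.
naive_mid = line_mid_file.split("\t")[1]  # 'def\n'
naive_eof = line_at_eof.split("\t")[1]    # 'def'

print(naive_mid == naive_eof)
print(second_column(line_mid_file) == second_column(line_at_eof))
```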

And then you can find further opportunities. For example:

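import csv

def read_file(path):
    with open(path) as f:
        # The files are tab-separated, so override csv's default comma delimiter.
        return set(row[1] for row in csv.reader(f, delimiter='\t'))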

As John Clements points out, you don't even really need all three of them to be sets, just A1, so you could instead do this:

def read_file(path):
    with open(path) as f:
        for row in csv.reader(f, delimiter='\t'):
            yield row[1]

setA1 = set(read_file('A.txt'))
iterB1 = read_file('B.txt')
iterC1 = read_file('C.txt')

The only other change you need is that you have to call intersection instead of using the & operator, so:

for key in setA1.intersection(iterB1):
    print key
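The reason intersection works where & does not is that the method accepts any iterable (and several at once), while the operator requires both operands to be sets. A rough sketch with in-memory lines standing in for the files; second_column and the lines_* lists are names invented for this example:

```python
def second_column(lines):
    """Lazily yield the second tab-separated field of each line."""
    for line in lines:
        yield line.rstrip("\n").split("\t")[1]

# Stand-ins for the three files, assuming tab-separated columns.
lines_a = ["13\tabc\n", "123\tdef\n", "234\tghi\n", "1234\tjkl\n", "12\tmno\n"]
lines_b = ["12\tabc\n", "12\tdef\n", "34\tqwe\n", "43\trty\n", "45\tmno\n"]
lines_c = ["12\tabc\n", "34\tsdg\n", "43\tyui\n", "54\tpoi\n", "54\tdef\n"]

set_a = set(second_column(lines_a))

# intersection() takes plain iterables, one or several.
common_ab = set_a.intersection(second_column(lines_b))
common_all = set_a.intersection(second_column(lines_b), second_column(lines_c))

# A generator can be consumed only once, so reusing it yields nothing.
iter_c = second_column(lines_c)
first_pass = set_a.intersection(iter_c)
second_pass = set_a.intersection(iter_c)  # iter_c is exhausted by now

print(sorted(common_ab), sorted(common_all), second_pass)
```

The single-use behavior is the main trade-off: if a file's values are needed for more than one comparison, either call the reader again or keep that file as a set.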

I'm not sure this last change is actually an improvement. But in Python 3.3 and later, where the only change needed is turning the return set(…) into yield from (…), I would probably do it this way. (If the files were huge and full of duplicates, so that this carried a performance cost, I would just wrap the read_file calls in unique_everseen from the itertools recipes.)
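For reference, a sketch of how those two pieces might fit together on Python 3.3 or later. unique_everseen here is a simplified version of the recipe in the itertools documentation (without the key argument), and read_column is a hypothetical variant of read_file that takes an open file object so the example can run on an in-memory sample.

```python
import csv
import io

def unique_everseen(iterable):
    """Yield each element the first time it appears, preserving order."""
    seen = set()
    for element in iterable:
        if element not in seen:
            seen.add(element)
            yield element

def read_column(f):
    """Lazily yield the second column of a tab-separated file object."""
    yield from (row[1] for row in csv.reader(f, delimiter="\t"))

# In-memory stand-in for a file with a duplicated second-column value.
sample = io.StringIO("12\tabc\n12\tdef\n34\tabc\n", newline="")

values = list(unique_everseen(read_column(sample)))
print(values)  # ['abc', 'def']
```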