Search code examples
pythonautomationopenoffice-calc

Is there an Excel function or a script to find and replace fields that match in two different sheets by another data contained in the second sheet?


Today is my 1st time using with Open Office and I need some help to get started. My boss gave me two Excel files containing data I have to complete.

  • The 1st sheet contains in 'A' a supplier code and a bunch a other columns with datas like the product dimensions that I need to keep.
  • The 2nd contains in 'A' the same supplier code, tho not in the same order as there is also a lot of them in double, and it has in 'B' a store code.

I need to add a column to the 1st sheet containing the store code if there is a match in both sheet with the code contained in 'A'.

It's not really my job but since everyone is gone for the summer he charged me to do it.

My problem is that both sheets are over 12 000 lines longs, and I know that there is only 700 to 800 articles that will have a match. And since its my 1st time using Excel/OpenOffice (I know I know...) I was wondering if there was a way to automate this work, either with OpenOffice or if I could use a script to do it. I've found a lot similar post but none of them is quite what I need.

Any help is welcome. Thx !


Solution

  • I would first try to let it run disregarding memory concerns and only fix that problem if you encounter it. 12,000 seems like a lot, but you might be surprised at what you can throw at python and have it "just work". I almost exclusively use csv files in programming when I encounter excel or the like...

    import csv
    
    #    B.csv:
    #    store#  part#
    #      xx     xx
    #      xx     xx
    #      xx     xx
    #    ...
    
    partNums = {}
    with open('B.csv') as bfile:
        breader = csv.DictReader(bfile)
        for row in breader:
            partNums[row['part#']] = row['store#']
    
    #    A.csv
    #    part#  tag1  tag2  tag3 ...
    #    xx     xx    xx    xx   ...
    #    xx     xx    xx    xx   ...
    #    xx     xx    xx    xx   ...
    #    ...
    
    
    with open('outfile.csv', 'wb') as outfile:
        with open('A.csv', 'rb') as afile:
            areader = csv.reader(afile)
            outwriter = csv.writer(outfile)
            headers = areader.next()
            headers.append('StoreCode')
            outwriter.writerow(headers)
            for row in areader:
                if row[0] in partNums:  #assuming first row is part#
                    row.append(partNums[row[0]])
                else:  #value if no store number exists
                    row.append('')
                outwriter.writerow(row)