Search code examples
pythonlistxlrd

Read files from list of files


Have been googling this question for quite a while now, but cannot seem to find a solution. I use the excelfiles-function to create a list of all excelfiles including "tst" in their name in a specified directory. After that I want to read certain cells from each document with the locate_vals-function, but I can't seem to reaf files from a list of files. Maybe there is a really simple solution to this that I just cannot see? The errormessage I get is at the bottom.

This is a part of a bigger task I asked for help for yesterday("Search through directories for specific Excel files and compare data from these files with inputvalues"), but as I can't seem to find any answer to this question I thought it might would be best to give it a thread of it's own. Correct me if I'm wrong and I'll remove it:)

import xlrd
import os, fnmatch

#globals

start_dir = 'C:/eclipse/TST-folder'

def excelfiles(pattern):
    file_list = []
    for root, dirs, files in os.walk(start_dir):
        for filename in files:
            if fnmatch.fnmatch(filename.lower(), pattern):
                if filename.endswith(".xls") or filename.endswith(".xlsx") or filename.endswith(".xlsm"):
                    file_list.append(os.path.join(root, filename))
    return file_list

file_list = excelfiles('*tst*')     # only accept docs hwom title includes tst
for i in file_list: print i


'''Location of each val from the excel spreadsheet'''


 def locate_vals():
     val_list = []
     for file in file_list:
         wb = xlrd.open_workbook(os.path.join(start_dir, file))
         sheet = wb.sheet_by_index(0)
         for vals in file:
             weightvalue = file_list.sheet.cell(3, 3).value
             lenghtvalue = sheet.cell(3, 2).value
             speedval = sheet.cell(3, 4).value

Errormessage:

Traceback (most recent call last):
  File "C:\Users\Håvard\Documents\Skulearbeid\UMB\4. Semester Vår\Inf120 Programmering og databehandling\Workspace\STT\tst_mainsheet.py", line 52, in <module>
    print locate_vals()
  File "C:\Users\Håvard\Documents\Skulearbeid\UMB\4. Semester Vår\Inf120 Programmering og databehandling\Workspace\STT\tst_mainsheet.py", line 48, in locate_vals
    weightvalue = file_list.sheet.cell(3, 3).value
AttributeError: 'list' object has no attribute 'sheet'

Solution

  • The problem showed by your traceback is indeed that this:

    weightvalue = file_list.sheet.cell(3, 3).value
    

    should be this:

    weightvalue = sheet.cell(3, 3).value
    

    However, there were more problems in your code. I've made minor fixes and marked them in the comments:

    import xlrd
    import os, fnmatch
    
    start_dir = 'C:/eclipse/TST-folder'
    
    def excelfiles(pattern):
        file_list = []
        for root, dirs, files in os.walk(start_dir):
            for filename in files:
                if fnmatch.fnmatch(filename.lower(), pattern):
                    if filename.endswith(".xls") or filename.endswith(".xlsx") or filename.endswith(".xlsm"):
                        file_list.append(os.path.join(root, filename))
        return file_list
    
    file_list = excelfiles('*tst*')     # only accept docs hwom title includes tst
    for i in file_list: print i
    
    
    '''Location of each val from the excel spreadsheet'''
    
    
    def locate_vals():
        val_dict = {}
        for filename in file_list:
            wb = xlrd.open_workbook(os.path.join(start_dir, filename))
            sheet = wb.sheet_by_index(0)
    
            # problem 2: extract these values once per sheet
            weightvalue = sheet.cell(3, 3).value
            lengthvalue = sheet.cell(3, 2).value
            speedvalue = sheet.cell(3, 4).value
    
            # problem 3: store them in a dictionary, keyed on filename
            val_dict[filename] = [weightvalue, lengthvalue, speedvalue]
    
        # dictionary keyed on filename, with value a list of the extracted vals
        return val_dict
    
    print locate_vals()