I am trying to write a script which will automate a copy/paste of employee time sheets from several files to one compiled file. Since they are time sheets with project codes some cells are left blank where an employee worked on a different project that day. Also the files have been converted from xlsx(2007) to .csv.xls which xlrd seems to open just fine.
I do know how to open and create a book object but my knowledge of this module is very limited so I thought maybe a general algorithm would be helpful:
import xlrd, xlwt
put all following in for or while loop to iterate through files:
book = xlrd.open_workbook('mybook.csv.xls')
extract data; store data for ouput
use for loop to iterate over data, output to final sheet
open next file, repeat process storing each output below the previous
I am looking for anything that will help me find the answers, not just code. Any help would be appreciated. Thanks.
This might help ... it reproduces your data as closely as possible (dates remain as dates, empty cells don't become text cells with 0-length contents, booleans and error cells don't become number cells).
from xlrd import XL_CELL_EMPTY, XL_CELL_TEXT, XL_CELL_NUMBER,
XL_CELL_DATE, XL_CELL_BOOLEAN, XL_CELL_ERROR, open_workbook
from xlwt import Row, easyxf, Workbook
method_for_type = {
XL_CELL_TEXT: Row.set_cell_text,
XL_CELL_NUMBER: Row.set_cell_number,
XL_CELL_DATE: Row.set_cell_number,
XL_CELL_ERROR: Row.set_cell_error,
XL_CELL_BOOLEAN: Row.set_cell_boolean,
}
date_style = easyxf(num_format_str='yyyy-mm-dd')
other_style = easyxf(num_format_str='General')
def append_sheet(rsheet, wsheet, wrowx=0):
for rrowx in xrange(rsheet.nrows):
rrowvalues = rsheet.row_values(rrowx)
wrow = wsheet.row(wrowx)
for rcolx, rtype in enumerate(rsheet.row_types(rrowx)):
if rtype == XL_CELL_EMPTY: continue
wcolx = rcolx
wmethod = method_for_type[rtype]
wstyle = date_style if rtype == XL_CELL_DATE else other_style
wmethod(wrow, wcolx, rrowvalues[rcolx], wstyle)
wrowx += 1
return wrowx
if __name__ == '__main__':
import sys, xlrd, xlwt, glob
rdpattern, wtfname = sys.argv[1:3]
wtbook = Workbook()
wtsheet = wtbook.add_sheet('guff')
outrowx = 0
for rdfname in glob.glob(rdpattern):
rdbook = open_workbook(rdfname)
rdsheet = rdbook.sheet_by_index(0)
outrowx = append_sheet(rdsheet, wtsheet, outrowx)
print outrowx
wtbook.save(wtfname)