I am currently trying to export/convert a text document I have into an .xls file. So after from what i've found i was able to create an xls but now i just need to get the formatting correct in xls from the text document.
Heres an example of what I am trying to do.
Lets say I have the following text document: numbers.txt
|<DOg>|
|Data1 = 300 |
|Data2 = 200 |
|Data3 = 15 |
|Data4 = 14 |
|Data5 = 4 |
|<DOg>|
|Data1 = 800 |
|Data2 = 500 |
|Data3 = 25 |
|Data4 = 10 |
|Data5 = 5 |
if I run my code using |
as the delimiter I receive this as the .xls file
As you can see the formatting is off.
The goal am i trying to get is the following formatting instead.
The current code I am using is the following:
mypath = raw_input("Please enter the directory path for the input files: ")
from os import listdir
from os.path import isfile, join
textfiles = [ join(mypath,f) for f in listdir(mypath) if isfile(join(mypath,f)) and '.txt' in f]
def is_number(s):
try:
float(s)
return True
except ValueError:
return False
import xlwt
import xlrd
style = xlwt.XFStyle()
style.num_format_str = '#,###0.00'
for textfile in textfiles:
f = open(textfile, 'r+')
row_list = []
for row in f:
row_list.append(row.split('|'))
column_list = zip(*row_list)
# for column_list in f:
# column_list.append(column.split('|'))
workbook = xlwt.Workbook()
worksheet = workbook.add_sheet('Sheet1')
i = 0
for column in column_list:
for item in range(len(column)):
value = column[item].strip()
if is_number(value):
worksheet.write(item, i, float(value), style=style)
else:
worksheet.write(item, i, value)
i+=1
workbook.save(textfile.replace('.txt', '.xls'))
My idea was using the .split()
method for columns however I'm unsure how to implemented correctly since when I use split
for columns every single line ends up being it's own column.
It looks like you have unlimited columns. You need to capture all the results in an array and transpose them as follows:
import re
# Strip all spaces and dump all data into an array
lines = [mo for mo in re.findall('(?s)(?<=\|)([<\w].+?)\s+?\|', open('py.txt').read())]
# Create an array to hold the transformation
combined = ['' for x in range(len(lines) / lines.count("<DOg>|"))]
# Append by rows
for idx in range(len(lines)):
combined[idx % len(combined)] += lines[idx] + ','
# Write array to file
output = open('numbersConverted.csv','w')
for comb in combined:
output.write(comb + "\n")
output.close
This will dump your results in a numbersConverted.csv ready for import.