I am writing a Python script that takes some strings and fits them to a fixed length (37 in this case), adding spaces where necessary. I finished it and thought it worked well, but then I noticed that when I switch the output file's encoding from ANSI to UTF-8 or vice versa, the spaces contain garbage characters. Here is my code:
from __future__ import print_function
from openpyxl import load_workbook
import sys
import openpyxl
import codecs
doc_name = input('Enter Excel document name: ')
wb = load_workbook(doc_name + '.xlsx')
ws = wb.active
sql = open(doc_name + '.sql','w')
column = input('Enter column where I can find gescal (usually B): ')
i = 1
while True:
    '''If cell empty finish'''
    cell = ws[column + str(i)].value
    if cell == None:
        break
    '''Calculate Gescal 37'''
    lengthcell = len(cell)
    if lengthcell < 37:
        gescal37 = cell + (' '*(37-lengthcell))
    elif lengthcell > 37:
        gescal37 = cell[:37]
    else:
        gescal37 = cell
    '''Calculate gescal 17'''
    gescal17 = gescal37[:17]
    '''Write it in the document'''
    sql.write('update installationuser set GESCAL37 = \'' + gescal37 + '\' where GESCAL37 = \'' + gescal17 + '\';\n')
    i += 1
sql.close()
I tried opening the document with the UTF-8 encoding, but then I had the same problem the other way around. It looks fine when viewed as UTF-8, but when I view it with the ANSI encoding, the spaces contain garbage characters.
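\xa0 is a non-breaking space (U+00A0). It is stored as different bytes in UTF-8 and in ANSI code pages, which is why it turns into garbage when the file is read with the other encoding.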
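To see why the same character looks wrong in both directions, here is a small sketch. It assumes that "ANSI" means the Windows-1252 code page (cp1252), which the question does not state, and the sample string is made up for illustration.

```python
# -*- coding: utf-8 -*-
# Assumption: "ANSI" here means the Windows-1252 code page (cp1252).
text = u'AB\xa0CD'  # hypothetical cell value containing a non-breaking space

utf8_bytes = text.encode('utf-8')    # U+00A0 becomes two bytes: C2 A0
ansi_bytes = text.encode('cp1252')   # U+00A0 becomes one byte: A0

# UTF-8 bytes read as ANSI: an extra character shows up before the space.
utf8_seen_as_ansi = utf8_bytes.decode('cp1252')

# ANSI bytes read as UTF-8: the lone A0 byte is invalid and gets replaced
# by the Unicode replacement character.
ansi_seen_as_utf8 = ansi_bytes.decode('utf-8', errors='replace')

print(repr(utf8_seen_as_ansi))
print(repr(ansi_seen_as_utf8))
```

A regular space (U+0020) is the same single byte in both encodings, so it would survive either conversion unchanged.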
Try
    gescal = gescal.replace(u'\xa0', ' ')
to replace it with a regular space. In your script, apply this to the cell value before calculating its length.
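Putting the replacement together with the padding from the question could look roughly like this. It is only a sketch: the helper names to_fixed_width and build_update are hypothetical, and ljust is used in place of the original if/elif/else, which should give the same result.

```python
WIDTH = 37  # target length taken from the question


def to_fixed_width(value, width=WIDTH):
    """Replace non-breaking spaces, then truncate or pad to `width`."""
    cleaned = value.replace(u'\xa0', u' ')
    # Slicing truncates long values; ljust pads short ones with spaces.
    return cleaned[:width].ljust(width)


def build_update(cell):
    """Build one UPDATE statement in the same shape as the question's."""
    gescal37 = to_fixed_width(cell)
    gescal17 = gescal37[:17]
    return (u"update installationuser set GESCAL37 = '%s' "
            u"where GESCAL37 = '%s';\n" % (gescal37, gescal17))


# Hypothetical cell value, for illustration only.
print(build_update(u'0800012345\xa0ABC'))
```

It may also help to open the output file with an explicit encoding, for example io.open(doc_name + '.sql', 'w', encoding='utf-8'), so that the bytes written do not depend on the platform default.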