Search code examples
pythonopenoffice.org

python to auto fill in import text in open office


Import Text

(character set ,seperator options, and Fields in Apache Open Office)

The original file I have is a csv file. I want to change the character set ,seperator options, and Fields using python code and save it into an excel file. Is it possible to auto fill in these options? Or else what language can I use to automate this action? ( I cant do this in excel because excel will delete some of my special character.)


Solution

  • You can use xlsxwriter module to make XLSX files: https://xlsxwriter.readthedocs.io/index.html

    Assume you have a CSV file with encoding CP1251 and you want to get XLSX file with encoding UTF8. Here is how it can be done:

    import xlsxwriter # pip3 install xlsxwriter
    
    # get data from the csv file with non utf8 encoding
    with open('data_cp1251.csv', 'r', encoding='cp1251') as f:
        data = f.read()
    
    # convert the data into 2d array
    table = [row.split(",") for row in data.split("\n")]
    
    # create xlsx file (utf8 encoding by default)
    ss = xlsxwriter.Workbook('data.xlsx')
    s = ss.add_worksheet()
    
    # fill the xlsx file with the 2d array
    for row_num, row in enumerate(table):
        for col_num, cell in enumerate(row):
            s.write(row_num, col_num, cell)
    
    ss.close() # here you get the 'data.xlsx' file
    

    For simply cases it works even if the source CSV file has tab \t separators. But it's need to test on your real data.

    And as far as I can tell all fields in the new xlsx file are 'text fields' by default. You can change their formats any time, see here: https://xlsxwriter.readthedocs.io/format.html#format