
How to read a complex txt file with blocks of data and save it as a CSV file in Python?

If I have a file organized like this:

Country 1

**this sentence is not important.
**date 25.09.2017, also not important

        Address A, 100 City. Country X
**work time 09h00-16h00<br>9h00-14h00
**012/345 67 89
**téléfax 123/456 67 89
**Home Office

        Address A, 200 City. Country X
**001/000 00 00
**téléfax 111/111 11 11
**Living address

        Address 0, 123 City
**000/000 00 00
**téléfax 222/222 22 22
Country 2

**this sentence is not important.
**date 25.09.2017, also not important

        AAA 11, 30 City 

        BBB 22, 30 City
**work time 08h00-12h30  
**000/000 00 00
**téléfax 111/11 11 11


And I want to put the data in a CSV file with these columns:

Country (Line right after ++++++++++++++), Address (Line right after *******), Office (after **), WorkTime (after **), Website (after **), Email (after **), Phone (after **), Fax (after **)

How do I do it in Python? The problem is that some entries are missing data, so I know some rows in the CSV file will end up messed up, but I don't mind doing some manual work tweaking the database afterwards. Another problem is that country names vary, so I would need to use ++++++++++++++ as the separator.

I tried something like this:

import csv
with open('listofdata.txt', 'r') as FILE:
   DATA = FILE.read()

LIST = DATA.split('++++++++++++++')

LIST2 = []
LIST3 = []
LIST4 = []

for ITEMS in LIST:
    LIST2 = ITEMS.split('*******')    
    for items2 in LIST2:
        LIST3 = items2.split('**')

with open('file.csv', 'w') as CSV:
    for ITEMS in LIST4:
        csv.write(ITEMS)

But it doesn't work.

ERROR:

Traceback (most recent call last):
  File "", line 22, in <module>
    csv.write(ITEMS)
AttributeError: 'module' object has no attribute 'write'



  • In the very last line you called csv.write(ITEMS), which refers to the csv module rather than your file object CSV; that is why you got the error.

    I added the steps for using the csv module in Python to your code.

    All you have to do now is work on your parsing method.


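    import csv

    # Read the whole input file into a single string.
    with open('listofdata.txt', 'r') as FILE:
        DATA = FILE.read()

    LIST = DATA.split('++++++++++++++')  # one chunk per country
    LIST4 = []
    for ITEMS in LIST:
        LIST2 = ITEMS.split('*******')  # one chunk per address
        for items2 in LIST2:
            LIST3 = items2.split('**')  # individual fields
            LIST4.append(LIST3)  # collect each field list as one CSV row

    # Write every collected field list as a row through a csv.writer.
    with open('file.csv', 'w') as csvfile:
        spamwriter = csv.writer(csvfile, delimiter=',')
        for ITEMS in LIST4:
            spamwriter.writerow(ITEMS)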


    Country 1
    ","this sentence is not important.
    ","date 25.09.2017, also not important
            Address A, 100 City. Country X
    ","work time 09h00-16h00<br>9h00-14h00
    ","012/345 67 89
    ","téléfax 123/456 67 89
    ","Home Office
            Address A, 200 City. Country X
    ","001/000 00 00
    ","téléfax 111/111 11 11
    ","Living address
            Address 0, 123 City
    ","000/000 00 00
    ","téléfax 222/222 22 22
    Country 2
    ","this sentence is not important.
    ","date 25.09.2017, also not important
            AAA 11, 30 City
            BBB 22, 30 City
    ","work time 08h00-12h30
    ","000/000 00 00
    ","téléfax 111/11 11 11