python database csv text export-to-csv

Convert complex txt to csv python script


I have a .txt file with this inside

Name: 321; 
Score:100; Used Time: 1:09:308;
GTime: 6/28/2024 10:04:18 PM;
Core Version : 21.0.0.0;
Software Version : 21.0.0.0;
AppID: 0S0; MapDispName: Future City; MapName:MapName MapName MapName;
Key:A0000-abcde-Q0000-F0000-00H00;  REG Date : 2/27/2021 1:16:34 PM; Expiry : 7/7/2024 12:00:00 AM

What I'm trying to do is convert that text into a .csv (table) using a Python script. There are 300 files with hundreds of lines in each file, but only the information in the first 7 lines needs to go into the CSV. All 300 files have the same format, just with different values.

What I would want the log.csv file to show is:

Name,Score,Time,Software Ver,Core Ver,AppID,Key,REG Date,Expiry,MapName
321,100,69.308s,21.0.0.0,21.0.0.0,0S0,A0000-abcde-Q0000-F0000-00H00,2/27/2021 1:16:34 PM,7/7/2024 12:00:00 AM,MapName MapName MapName

How can I do it with python? Thanks.


Solution

  • Your example suggests that every value follows the same format, i.e. Key:Value;

    Use glob.glob() to iterate over all of your text filenames. You can use islice() to read in exactly 7 lines, then convert them into a single line. This can then be split on ; to give you a list of key value pairs. These can then be further split on the : and strip() applied to remove any extra spaces.

    Lastly make use of itemgetter() to extract only the elements you need from the resulting list.
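
To see where the index numbers passed to itemgetter() come from, here is a small sketch that runs these steps on the sample text from the question and prints each position of the flattened list. The names sample and flatten() are made up for illustration and are not part of the script below.

```python
from itertools import chain

# Sample text copied from the question (first 7 lines of one file).
sample = """Name: 321;
Score:100; Used Time: 1:09:308;
GTime: 6/28/2024 10:04:18 PM;
Core Version : 21.0.0.0;
Software Version : 21.0.0.0;
AppID: 0S0; MapDispName: Future City; MapName:MapName MapName MapName;
Key:A0000-abcde-Q0000-F0000-00H00;  REG Date : 2/27/2021 1:16:34 PM; Expiry : 7/7/2024 12:00:00 AM
"""

def flatten(text):
    """Return [key, value, key, value, ...] for 'Key: Value;' formatted text."""
    pairs = text.replace('\n', '').split(';')
    # Split each pair only once on ':' so times and dates keep their colons.
    return [v.strip() for v in chain.from_iterable(p.split(':', 1) for p in pairs)]

values = flatten(sample)
for index, value in enumerate(values):
    print(index, value)
```

Keys land on the even positions and values on the odd ones, so 1 is the Name value, 3 the Score, 5 the Used Time, 9 the Core Version, 11 the Software Version, and so on. The full script that applies this to every file follows.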

    from itertools import islice, chain
    from operator import itemgetter
    import csv
    import glob
    import os
    
    # Positions of the wanted values in the flattened [key, value, key, value, ...] list.
    # Software Version (11) comes before Core Version (9) to match the header order.
    get = itemgetter(1, 3, 5, 11, 9, 13, 19, 21, 23, 17)
    
    with open('log.csv', 'w', newline='') as f_output:
        csv_output = csv.writer(f_output)
        csv_output.writerow('Name,Score,Time,Software Ver,Core Ver,AppID,Key,REG Date,Expiry,MapName,Filename'.split(','))
    
        for filename in glob.glob('*.txt'):
            with open(filename) as f_input:
                # Read only the first 7 lines and join them into a single string
                data = ''.join(islice(f_input, 7)).replace('\n', '').split(';')
                # Split each Key:Value pair once on ':' and flatten into one list
                values = [v.strip() for v in chain.from_iterable(d.split(':', 1) for d in data)]
                csv_output.writerow([*get(values), os.path.basename(filename)])
    

    For your example, this would give you log.csv containing:

    Name,Score,Time,Software Ver,Core Ver,AppID,Key,REG Date,Expiry,MapName,Filename
    321,100,1:09:308,21.0.0.0,21.0.0.0,0S0,A0000-abcde-Q0000-F0000-00H00,2/27/2021 1:16:34 PM,7/7/2024 12:00:00 AM,MapName MapName MapName,file1.txt
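
The output above keeps Used Time as 1:09:308, whereas the question asks for 69.308s. One possible way to convert it is sketched below, assuming the field is always minutes:seconds:milliseconds (this matches the single example given, where 1:09:308 becomes 69.308s). The function name to_seconds() is made up for illustration.

```python
def to_seconds(used_time):
    """Convert 'M:SS:mmm' (assumed minutes:seconds:milliseconds) to a string like '69.308s'."""
    minutes, seconds, millis = (int(part) for part in used_time.split(':'))
    total_ms = (minutes * 60 + seconds) * 1000 + millis
    # Integer arithmetic avoids floating-point rounding in the formatted result.
    return f'{total_ms // 1000}.{total_ms % 1000:03d}s'

print(to_seconds('1:09:308'))  # 69.308s
```

In the script, this could be applied to the third extracted value before the row is written, for example by turning the result of get(values) into a list and replacing its element at index 2 with to_seconds() of that element.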