Search code examples
pythonrowsspreadsheettranspose

python text file to rows and columns


So I have been trying at this for a while and seem to have hit a road block and would like an assist.

I have a couple text files. Without writing it all out here's an example:

2020
Grum Grum
Stamina: 20
Agility: 23
Strength: 20.5%
Resistances: 20-21-30

2020
Mondo Silo
Stamina: 23
Agility: 13
Strength: 10.5%
Resistances: 20-21-20

And so on and so forth. Some are like this every 6 lines it starts a new stats file, and some text files have it so there are every 10 lines a new stats sheet is there.

My goal is to make every time the stats sheet end, to put it into a row and columns. I think it's called transposed, in spreadsheets terminology, but idk what I'm doing wrong. Or even if that's correct to say..

As an example I'd like the file to look like this when I'm done.

Year | Name | Stamina | Agility | Str | Res
2020 | Grum Grum | Stamina: 20 | Agility: 23 | Strength: 20.5% | Resistances: 20-21-30

I've tried Numpy, Pandas, and idk what I'm doing wrong and honestly don't know what to search to find the right answers.

Would appreciate it if I could get any help, these files are very big and I'd like to be able to specific which number of columns I need the stats sheet to fill.

Thank you in advance if you can help.


Solution

  • You could try this to get the dataframe desired:

    with open(r'test1.txt','r') as file:
        data=file.read().split('\n\n')
    data=[i.split('\n') for i in data]
    df=pd.DataFrame(data,columns=['Year','Name','Stamina','Agility','Str','Res'])
    
    print(df)
    

    Output:

       Year        Name  ...              Str                    Res
    0  2020   Grum Grum  ...  Strength: 20.5%  Resistances: 20-21-30
    1  2020  Mondo Silo  ...  Strength: 10.5%  Resistances: 20-21-20
    2  2020   Grum Grum  ...  Strength: 20.5%  Resistances: 20-21-30
    3  2020  Mondo Silo  ...  Strength: 10.5%  Resistances: 20-21-20
    

    And to write the dataframes of a list of .txt files with different number of rows, and with the same structure you can try:

    Option 1

    import pandas as pd
    
    files=['test1.txt','test2.txt']                     #list of files
    
    df=pd.DataFrame(columns=['Year','Name','Stamina','Agility','Str','Res'])  #create the dataframe
    
    for file in files:                                  #we open each file
        with open(r'path_of_files'+file,'r') as file_r:   
            data=file_r.read().strip().split('\n\n')
            data=[i.split('\n') for i in data if i!=''] #get the rows
            print(data)
            s = pd.DataFrame(data, columns=df.columns)  
            df =pd.concat([df, s], ignore_index=True)   #we append the new rows to the dataframe
            
            
    print(df)
    df.to_csv(r'test3.txt', sep='|', index=False)       #write the final dataframe to the output file('test3.txt'), with '|' as separator 
    

    Option 2

    import pandas as pd
    
    files=['test1.txt','test2.txt']                      #list of files
    
    for file in files:                                   #we open each file
        with open(r'path_of_files'+file,'r') as file_r, open(r'test3.txt', 'a') as fout:
            data=file_r.read().strip().split('\n\n')
            data=[i.split('\n') for i in data if i!='']
            df=pd.DataFrame(data,columns=['Year','Name','Stamina','Agility','Str','Res'])   #create a dataframe with the data of the current file
            if files.index(file)==0:
                fout.write(df.to_string( index = False)) #we let header=true to the first iteration to write the columns, and also write the data
            else:
                fout.write(df.to_string(header = False, index = False))  #we write the dataframe without the index and the columns names
            fout.write('\n')                             #a newline to place correctly the next rows
    

    EXAMPLE
    With some dummy files like the ones below(test1.txt,test2.txt), you can see the result(test3.txt) with both options:

    test1.txt

    2020
    Grum Grum
    Stamina: 20
    Agility: 23
    Strength: 20.5%
    Resistances: 20-21-30
    
    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20
    

    test2.txt

    2020
    Grum Grum
    Stamina: 20
    Agility: 23
    Strength: 20.5%
    Resistances: 20-21-30
    
    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20
    
    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20
    
    2020
    Mondo Silo
    Stamina: 23
    Agility: 13
    Strength: 10.5%
    Resistances: 20-21-20
    

    test3.txt(Output file) with Option 1

    Year|Name|Stamina|Agility|Str|Res
    2020|Grum Grum|Stamina: 20|Agility: 23|Strength: 20.5%|Resistances: 20-21-30
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
    2020|Grum Grum|Stamina: 20|Agility: 23|Strength: 20.5%|Resistances: 20-21-30
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
    2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
    

    test3.txt(Output file) with Option 2

     Year        Name      Stamina      Agility              Str                    Res
     2020   Grum Grum  Stamina: 20  Agility: 23  Strength: 20.5%  Resistances: 20-21-30
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
     2020   Grum Grum  Stamina: 20  Agility: 23  Strength: 20.5%  Resistances: 20-21-30
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
     2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20