python pandas numpy scripting export-to-csv

Is there any way to convert a log file to CSV using Python?


I have a log file in .txt format and want to convert it to .csv format with a header. Is there any way to store the data from the .txt file in a .csv file in a specific format? I have attached both the log file and the required CSV file. Link to log file here

New Data Received at : 2022-06-07 17:19:42.714867                                                                                                                                                   
                                                                                                                                                    
$   1   SECURE  SAM1.0  DT  H   8.64337E+14 TN04E1234   0   0   0   0   0   0   0   0   0   0   0   0   0   BSNL    1   1   15  0   0   W   0   0   0   0   0   0   0   0   1   59*
$   1   SECURE  SAM1.0  DT  H   8.64337E+14 TN04E1234   0   0   0   0   0   0   0   0   0   0   0   0   0   BSNL    1   1   15  0   0   W   0   0   0   0   0   0   0   0   2   60*
$   1   SECURE  SAM1.0  DT  H   8.64337E+14 TN04E1234   0   0   0   0   0   0   0   0   0   0   0   0   0   BSNL    1   1   15  0   0   W   0   0   0   0   0   0   0   0   3   61*
$   1   SECURE  SAM1.0  DT  L   8.64337E+14 TN04E1234   0   6012080 50  0   0   0   0   0   0   0   0   0   0   BSNL    1   1   15  0   0   W   0   0   0   0   0   0   0   0   4   88*
$   1   SECURE  SAM1.0  DT  L   8.64337E+14 TN04E1234   0   6012080 100 0   0   0   0   0   0   0   0   0   0   BSNL    1   1   15  0   0   W   0   0   0   0   0   0   0   0   5   85*
$   1   SECURE  SAM1.0  DT  L   8.64337E+14 TN04E1234   0   6012080 110 0   0   0   0   0   0   0   0   0   0   BSNL    1   1   15  0   0   W   0   0   0   0   0   0   0   0   6   87*
                                                                                                    

Required format of the .csv file: link to sample CSV file here


Solution

  • I'll use the small part of the log file provided in your question.

    First we need to remove the description lines. I assume that every entry starts with $, and in your sample log the fields are separated by whitespace, so I split each line on whitespace.

    import pandas as pd
    
    # Read the log and keep only the data records, which start with '$'
    with open('log.txt', 'r') as file:
        filt_lines = [l for l in file if l.startswith('$')]
    
    # Split each record on whitespace into its individual fields
    array = [l.split() for l in filt_lines]
    
    # One row per record, one column per field (columns are numbered 0..N-1)
    log_df = pd.DataFrame(array)
    
    # And write it to csv
    log_df.to_csv("log.csv")
    

    The result is

    0   1   2   3   4   5   6   7   8   9   ...     28  29  30  31  32  33  34  35  36  37
    0   $   1   SECURE  SAM1.0  DT  H   8.64337E+14     TN04E1234   0   0   ...     0   0   0   0   0   0   0   0   1   59*
    1   $   1   SECURE  SAM1.0  DT  H   8.64337E+14     TN04E1234   0   0   ...     0   0   0   0   0   0   0   0   2   60*
    2   $   1   SECURE  SAM1.0  DT  H   8.64337E+14     TN04E1234   0   0   ...     0   0   0   0   0   0   0   0   3   61*
    3   $   1   SECURE  SAM1.0  DT  L   8.64337E+14     TN04E1234   0   6012080     ...     0   0   0   0   0   0   0   0   4   88*
    4   $   1   SECURE  SAM1.0  DT  L   8.64337E+14     TN04E1234   0   6012080     ...     0   0   0   0   0   0   0   0   5   85*
    5   $   1   SECURE  SAM1.0  DT  L   8.64337E+14     TN04E1234   0   6012080     ...     0   0   0   0   0   0   0   0   6   87*
    

    Some work is still needed on log_df before writing it out to get the required CSV file. The required CSV example has some columns that are not in the log file; you have to identify them and create them manually. Then add an index to the DataFrame and write it to CSV.
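
The remaining steps described above (naming columns, creating the missing ones, adding an index, then writing) could be sketched roughly as follows. This is only an illustration under assumptions: the column names `start` and `packet_status`, the extra column `remarks`, and the index label `row` are hypothetical placeholders, since the real header of the required CSV is not shown here. The two sample records are copied from the log excerpt in the question.

```python
import io
import pandas as pd

# Two records taken from the log excerpt in the question
SAMPLE_LOG = """New Data Received at : 2022-06-07 17:19:42.714867

$ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 0 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 1 59*
$ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 50 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 4 88*
"""


def log_to_frame(log_text, column_names=None, extra_columns=None):
    """Parse '$'-prefixed records into a DataFrame ready for export."""
    # Keep only the data records and split each one on whitespace
    rows = [line.split() for line in log_text.splitlines()
            if line.startswith('$')]
    df = pd.DataFrame(rows)

    # Rename the columns whose meaning is known; the mapping goes from
    # the default integer label to a name (names here are hypothetical)
    if column_names:
        df = df.rename(columns=column_names)

    # Create columns that the target CSV needs but the log does not contain
    for name, default in (extra_columns or {}).items():
        df[name] = default

    # Replace the default 0-based index with a 1-based one
    df.index = range(1, len(df) + 1)
    return df


df = log_to_frame(
    SAMPLE_LOG,
    column_names={0: "start", 5: "packet_status"},  # assumed names
    extra_columns={"remarks": ""},                  # assumed missing column
)

# Write to an in-memory buffer here; pass a file path such as "log.csv"
# to to_csv instead to create the actual file
buffer = io.StringIO()
df.to_csv(buffer, index_label="row")
print(buffer.getvalue())
```

Columns that are not renamed keep their integer labels in the header, so in practice the mapping would need one entry for every column of the required CSV.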