I have a log file in .txt format and want to convert it to CSV format with a header. Is there any way I can store the data from the .txt file in a .csv file in a specific format? I have attached both the log file and the required CSV file. Link to log file here
New Data Received at : 2022-06-07 17:19:42.714867
$ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 0 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 1 59*
$ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 0 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 2 60*
$ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 0 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 3 61*
$ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 50 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 4 88*
$ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 100 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 5 85*
$ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 110 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 6 87*
The required format is shown in the sample .csv file: link to sample CSV file here
I'll use the small part of the log file provided in your question.
First we need to remove the description lines (such as "New Data Received at : ...").
I assume that all entries start with $. In your sample log, fields are separated by spaces, so I split each line on whitespace.
import pandas as pd

# Read the whole log and split it into lines
with open('log.txt', 'r') as file:
    lines = file.read().split('\n')

# Keep only the record lines, which start with '$'
filt_lines = []
for l in lines:
    if l.startswith('$'):
        filt_lines.append(l)

# Split each record on whitespace and build the DataFrame
array = [l.split() for l in filt_lines]
log_df = pd.DataFrame(array)

# And write it to csv
log_df.to_csv("log.csv")
The result is
0 1 2 3 4 5 6 7 8 9 ... 28 29 30 31 32 33 34 35 36 37
0 $ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 ... 0 0 0 0 0 0 0 0 1 59*
1 $ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 ... 0 0 0 0 0 0 0 0 2 60*
2 $ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 ... 0 0 0 0 0 0 0 0 3 61*
3 $ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 ... 0 0 0 0 0 0 0 0 4 88*
4 $ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 ... 0 0 0 0 0 0 0 0 5 85*
5 $ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 ... 0 0 0 0 0 0 0 0 6 87*
You still need to do some work on log_df before writing it out to get the required CSV file. Some columns in the required CSV example are not in the log file, so you have to check for them and create those columns manually. Then add an index to the DataFrame and write it.
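As a rough sketch of that remaining work: the column names below (field_N, vendor, vehicle_no, remarks, S.No) are placeholders I made up, because I don't know the real headers in your sample CSV. Replace them with the headers you actually need. The sketch uses two records copied from your log instead of reading log.txt, so it runs on its own.

```python
import pandas as pd

# Two records copied from the log in the question, plus the description line.
sample_log = """New Data Received at : 2022-06-07 17:19:42.714867
$ 1 SECURE SAM1.0 DT H 8.64337E+14 TN04E1234 0 0 0 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 1 59*
$ 1 SECURE SAM1.0 DT L 8.64337E+14 TN04E1234 0 6012080 50 0 0 0 0 0 0 0 0 0 0 BSNL 1 1 15 0 0 W 0 0 0 0 0 0 0 0 4 88*
"""

# Same filtering and splitting as above: keep '$' lines, split on whitespace.
rows = [line.split() for line in sample_log.splitlines() if line.startswith("$")]
log_df = pd.DataFrame(rows)

# Give every column a header. These names are hypothetical placeholders.
log_df.columns = [f"field_{i}" for i in range(log_df.shape[1])]
log_df = log_df.rename(columns={"field_2": "vendor", "field_7": "vehicle_no"})

# Create a column that exists in the target CSV but not in the log.
# "remarks" is a made-up example; it is left empty here.
log_df["remarks"] = ""

# Use a 1-based index and give the index column its own header.
log_df.index = log_df.index + 1
csv_text = log_df.to_csv(index_label="S.No")  # pass a path to write a file instead
print(csv_text)
```

If you don't want the index column in the file at all, pass index=False to to_csv instead of index_label.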