python logparser log-analysis

Parsing and formatting a log file into a structured format in Python


I have a log file with 1000+ lines in the format below:

<date> <time 1> {serial_no 1} {event:...} <message 1> 
<date> <time 2> {serial_no 2} {event:...} <message 2> 
<date> <time 3> {serial_no 3} {event:...} <message 3> 
..
..
..
<date> <time n> {serial_no n} {event:...} <message n>

I need to extract only particular messages, together with their corresponding date and time, and store them in a file.
I also need to extract the messages that contain keywords given on the command line.
The command line should accept any number of keywords.
Example: >python file.py -k <key 1> -k <key 2> -k <key 3>
The output must contain all messages that contain the input keywords, with their corresponding date and time.
I also need to rewrite particular messages from the log file as simple sentences in the output, since the messages in the log file are complicated and difficult to understand.
I would like to know which open-source Python libraries can be used to convert the above log file into a structured format.


Solution

  • You're not going to get a library that does everything for you.

    But the csv module contains good tools for splitting lines on a delimiter, which would let you parse out your data and interrogate it with your own code in the .py file.

    Or you could use pandas: load the file into a DataFrame and query the DataFrame (again, you customise the query in the .py file).

    See the link below for reading the file into pandas using read_csv:

    https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
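
As a rough illustration of the "your own code" part, here is a minimal standard-library sketch using re, argparse and csv. It is not the only way to do this, and it rests on several assumptions that are not in the question: that a real line looks like `2021-03-04 12:00:01 {1001} {event:LOGIN} user alice logged in`, that a message should be kept if it contains any of the keywords (case-insensitive), and that the output is a CSV file. The names LINE_RE, parse_line, filter_records, main and the -o option are hypothetical and chosen only for this example.

```python
import argparse
import csv
import re

# Assumed line layout (hypothetical; adjust the pattern to the real file):
#   2021-03-04 12:00:01 {1001} {event:LOGIN} user alice logged in
LINE_RE = re.compile(
    r"^(?P<date>\S+)\s+(?P<time>\S+)\s+"
    r"\{(?P<serial_no>[^}]*)\}\s+"
    r"\{event:(?P<event>[^}]*)\}\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def filter_records(lines, keywords):
    """Yield parsed records whose message contains any keyword.

    Matching is case-insensitive. With no keywords, every parsed record
    is yielded. Lines that do not match LINE_RE are skipped.
    """
    lowered = [k.lower() for k in keywords]
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        message = record["message"].lower()
        if not lowered or any(k in message for k in lowered):
            yield record


def main(argv=None):
    parser = argparse.ArgumentParser(description="Filter log messages by keyword.")
    parser.add_argument("logfile", help="path to the log file")
    # action="append" collects every repeated -k into one list.
    parser.add_argument("-k", "--keyword", action="append", default=[],
                        help="keyword to search for; may be repeated")
    parser.add_argument("-o", "--output", default="filtered.csv",
                        help="CSV file to write (assumed output format)")
    args = parser.parse_args(argv)

    with open(args.logfile, encoding="utf-8") as src, \
            open(args.output, "w", newline="", encoding="utf-8") as dst:
        # Only date, time and message are written; other fields are ignored.
        writer = csv.DictWriter(dst, fieldnames=["date", "time", "message"],
                                extrasaction="ignore")
        writer.writeheader()
        writer.writerows(filter_records(src, args.keyword))
```

To use this as a script, call main() under the usual `if __name__ == "__main__":` guard and run something like `python file.py app.log -k error -k timeout`. Rewriting complicated messages as simple sentences is not covered here; that would need your own mapping from event types or message patterns to plain text.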