I've just started playing with Python and I'm running some tests in my environment. The idea is to create a simple script that finds recurring errors within a given period of time.
Basically, I want to count the number of times a server fails in my daily logs. If the failure happens more than a given number of times (say, 10) over a given period (say, 30 days), I should be able to raise an alert in a log. However, I'm not trying to simply count the errors in a 30-day interval. What I actually want is to count the number of times the error happened, recovered, and then happened again, so that I avoid reporting the same problem more than once if it persists for several days.
For instance, let's say:
file_2016_Oct_01.txt@hostname@YES
file_2016_Oct_02.txt@hostname@YES
file_2016_Oct_03.txt@hostname@NO
file_2016_Oct_04.txt@hostname@NO
file_2016_Oct_05.txt@hostname@YES
file_2016_Oct_06.txt@hostname@NO
file_2016_Oct_07.txt@hostname@NO
Given the scenario above, I want the script to interpret it as 2 failures instead of 4, because a server may sometimes keep the same status for days before recovering, and I want to identify the recurrence of the problem instead of just counting the total number of failures.
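To make the "2 instead of 4" reading concrete, here is a minimal sketch that collapses consecutive identical statuses with `itertools.groupby`. It assumes that `NO` is the failing status (the only reading of the sample that gives 2 versus 4) and that the lines are already in chronological order; the names `records`, `FAILURE` and `episodes` are only illustrative.

```python
from itertools import groupby

# Sample records from the scenario above, already in chronological order.
records = [
    "file_2016_Oct_01.txt@hostname@YES",
    "file_2016_Oct_02.txt@hostname@YES",
    "file_2016_Oct_03.txt@hostname@NO",
    "file_2016_Oct_04.txt@hostname@NO",
    "file_2016_Oct_05.txt@hostname@YES",
    "file_2016_Oct_06.txt@hostname@NO",
    "file_2016_Oct_07.txt@hostname@NO",
]

# Assumption: "NO" marks a failed day.
FAILURE = "NO"

# The status is the third "@"-separated field of each line.
statuses = [line.split("@")[2] for line in records]

# Plain count: every failed day is counted.
total_failures = statuses.count(FAILURE)

# groupby yields one group per run of identical consecutive statuses,
# so each run of failures is counted once.
episodes = sum(1 for status, _ in groupby(statuses) if status == FAILURE)

print(total_failures, episodes)  # prints: 4 2
```

This only covers a single host; with several hosts mixed together the statuses would need to be grouped per host first.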
For the record, this is how I'm going through the files:
import datetime
import glob
import os

# Creates an empty list
history_list = []

# Function to find the files from the last 30 days
def f_findfiles():
    # First define the cut-off day, which means the last number
    # of days which the script will consider for the analysis
    cut_off_day = datetime.datetime.now() - datetime.timedelta(days=30)
    # Loop through all history files and keep the ones
    # modified within the last 30 days
    for file in glob.iglob("/opt/hc/*.txt"):
        filetime = datetime.datetime.fromtimestamp(os.path.getmtime(file))
        if filetime > cut_off_day:
            history_list.append(file)

# Just included the function below to show how I'm going
# through the files, this is where I got stuck...
def f_openfiles(arg):
    for path in arg:
        with open(path, "r") as handle:
            for line in handle:
                # Split "filename@hostname@status" into its three fields
                clean_line = line.strip().split("@")

# Main function
def main():
    f_findfiles()
    f_openfiles(history_list)
I'm opening the files using 'with' and reading all the lines from all the files in a 'for', but I'm not sure how I can navigate through the data to compare the value related to one file with the older files.
I've tried putting all the data in a dictionary, in a list, or just enumerating and comparing, but I've failed with all of these methods :-(
Any tips on what would be the best approach here? Thank you!
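One possible way to line the records up so that each one can be compared with the previous day is to parse the date out of the first field and sort on it. This is only a sketch: it assumes the first field always follows the `file_YYYY_Mon_DD.txt` pattern from the sample and that month abbreviations are in English (`%b` is locale-dependent). `parse_record` and `lines` are hypothetical names, not part of the original script.

```python
import datetime

def parse_record(line):
    """Split 'file_2016_Oct_01.txt@hostname@YES' into (date, host, status)."""
    filename, host, status = line.strip().split("@")
    # "file_2016_Oct_01.txt" -> "2016_Oct_01"
    stamp = filename[len("file_"):-len(".txt")]
    # %b matches the abbreviated month name of the current locale.
    day = datetime.datetime.strptime(stamp, "%Y_%b_%d").date()
    return day, host, status

# Lines deliberately out of order, as they might arrive from several files.
lines = [
    "file_2016_Oct_03.txt@hostname@NO",
    "file_2016_Oct_01.txt@hostname@YES",
    "file_2016_Oct_02.txt@hostname@YES",
]

# Tuples sort by their first element, so this orders the records by date.
records = sorted(parse_record(line) for line in lines)

# Pair each record with the one before it and report status changes.
for (_, _, prev_status), (day, host, status) in zip(records, records[1:]):
    if status != prev_status:
        print(host, "changed from", prev_status, "to", status, "on", day)
```

Sorting on the parsed date rather than on the file's modification time keeps the order stable even if old files get touched or copied.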
I'd rather handle this with shell utilities (e.g. uniq), but since you prefer to use Python:
With minimal effort, you can handle it by creating a dict object with strings (like 'file_2016_Oct_01.txt@hostname@YES') as the keys.
While iterating over the log, check whether the corresponding key exists in the dictionary (with if 'file_2016_Oct_01.txt@hostname@YES' in my_log_dict), then assign or increment the dict value accordingly.
A short sample:
data_log = {}
lookup_string = 'foobar'
if lookup_string in data_log:
    data_log[lookup_string] += 1
else:
    data_log[lookup_string] = 1
Alternatively, as a one-liner (it usually looks ugly in Python, so I've added line breaks to keep it readable):
data_log[lookup_string] = data_log[lookup_string] + 1 \
    if lookup_string in data_log \
    else 1
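The same dict idea can be stretched to the recurrence problem from the question by keying on the hostname instead of the whole line and remembering each host's previous status. The following is a rough sketch, not a tested solution for the real logs: it assumes `NO` is the failing status and that records arrive in chronological order, and `count_failure_episodes`, `web01`, `db01` and the threshold of 1 are made up for illustration (the question mentions 10).

```python
def count_failure_episodes(records, failure="NO"):
    """Count, per host, how many times it entered the failure state.

    `records` is an iterable of (host, status) pairs in chronological order.
    """
    last_status = {}  # host -> most recent status seen
    episodes = {}     # host -> number of distinct failure episodes
    for host, status in records:
        # A new episode starts only if the host was not already failing.
        if status == failure and last_status.get(host) != failure:
            episodes[host] = episodes.get(host, 0) + 1
        last_status[host] = status
    return episodes

# Hypothetical hosts: web01 fails, recovers and fails again; db01 stays down.
sample = [
    ("web01", "YES"), ("db01", "NO"),
    ("web01", "NO"), ("db01", "NO"),
    ("web01", "YES"), ("db01", "NO"),
    ("web01", "NO"), ("db01", "NO"),
]

result = count_failure_episodes(sample)

# Illustrative threshold; alert on hosts with more episodes than this.
threshold = 1
alerts = [host for host, count in result.items() if count > threshold]

print(result)
print(alerts)
```

Here `dict.get(key, default)` does the same job as the `if ... in` check above, just in a single expression.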